Quasi-uniform designs with optimal and near-optimal uniformity constant

A design is a collection of distinct points in a given set $X$, which is assumed to be a compact subset of $R^d$, and the mesh-ratio of a design is the ratio of its fill distance to its separation radius. The uniformity constant of a sequence of nested designs is the smallest upper bound for the mesh-ratios of the designs. We derive a lower bound on this uniformity constant and show that a simple greedy construction achieves this lower bound. We then extend this scheme to allow more flexibility in the design construction.


Introduction
Let $X$ be a compact subset of $\mathbb{R}^d$, for some $d \ge 1$, with $\mathrm{vol}(X) > 0$. Let $\|\cdot\|$ denote a norm on $\mathbb{R}^d$, not necessarily the Euclidean norm $\|\cdot\|_2$. The ball of radius $r$ and center $x$ is $B(x, r) = \{x' \in \mathbb{R}^d : \|x' - x\| \le r\}$. The volume of the unit ball $B(0, 1)$ is denoted by $V_d$. If the norm $\|\cdot\|$ is Euclidean, then $V_d = \pi^{d/2}/\Gamma(d/2 + 1)$.
A collection $X_n = \{x_1, \dots, x_n\}$ of $n$ distinct points in $X$ will be called an $n$-point design (in the modern literature on approximation theory, designs are often called "data sets", see e.g. [18,23]). We start with several definitions of well-known characteristics of designs.
FD, the fill distance (also known as mesh norm, covering radius, dispersion, or minimax-distance criterion), of the $n$-point design $X_n$ for $X$ is
$$ h(X_n) = \max_{x \in X} \min_{x_i \in X_n} \|x - x_i\| \,. $$
A design $X^*_{n,\mathrm{FD}}$ will be called FD-optimal if $h^*_n = h(X^*_{n,\mathrm{FD}}) = \min_{X_n \subset X} h(X_n)$.

SR, the separation radius (also called packing radius or maximin-distance criterion), of $X_n$ is
$$ q(X_n) = \frac{1}{2} \min_{i \neq j} \|x_i - x_j\| \,. $$
A design $X^*_{n,\mathrm{SR}}$ will be called SR-optimal if $q^*_n = q(X^*_{n,\mathrm{SR}}) = \max_{X_n \subset X} q(X_n)$.

The mesh-ratio of $X_n$ for $X$ is
$$ \mathrm{MR}(X_n) = \frac{h(X_n)}{q(X_n)} \,. $$
The mesh-ratio provides a measure of how uniformly the points of $X_n$ are distributed in $X$, see e.g. [18, p. 573] and [6, p. 129]; it is sometimes called the uniformity constant of $X_n$, see [4]. The mesh-ratio is commonly used to assess the stability of approximations constructed from observations at the $x_i \in X_n$, see e.g. [18] and [23, Chapter 12]. According to Guideline 7.10 in [18, p. 579], the best approximation error with the most stable system is achieved by using quasi-uniform designs (data sets) with the smallest mesh-ratio. The mesh-ratio is also fundamental in the estimation of the stability of approximations through the approach involving the Lebesgue constant, see [4, Th. 1] and [9, Sect. 8.5]. Moreover, the mesh-ratio plays an important role in the derivation of upper bounds on the quality of kernel approximations in the so-called 'escape theorems', where the approximated function is less smooth than the kernel, see [11,12] as well as [6, Th. 1, p. 129] and [18, Th. 7.8].
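In practice, these three quantities are easy to evaluate numerically once $X$ is replaced by a dense finite grid (this gives an approximation, not the exact continuum values). The following sketch, with a hypothetical `grid` standing in for $X$, illustrates the definitions on a small design in $[0,1]^2$ with the Euclidean norm:

```python
import itertools
import math

def fill_distance(design, candidates):
    # h(X_n): max over x in the (discretized) set X of the min distance to the design
    return max(min(math.dist(x, xi) for xi in design) for x in candidates)

def separation_radius(design):
    # q(X_n): half the smallest pairwise distance within the design
    return 0.5 * min(math.dist(a, b) for a, b in itertools.combinations(design, 2))

def mesh_ratio(design, candidates):
    # MR(X_n) = h(X_n) / q(X_n)
    return fill_distance(design, candidates) / separation_radius(design)

# Example: the 2x2 grid {1/4, 3/4}^2 in X = [0, 1]^2
grid = [(i / 40, j / 40) for i in range(41) for j in range(41)]  # finite stand-in for X
design = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
h = fill_distance(design, grid)   # attained at the corners and the center
q = separation_radius(design)
```

Here $h(X_4) = \sqrt{2}/4$, $q(X_4) = 1/4$ and $\mathrm{MR}(X_4) = \sqrt{2}$.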
Let $X_\infty = \{x_1, x_2, \dots\} \subset X$ be a sequence of points in $X$. There is a one-to-one correspondence between such a point sequence $X_\infty$ and the sequence $\{X_n\}_{n=1}^\infty$ of nested designs $X_n = \{x_1, \dots, x_n\}$. The uniformity constant of $X_\infty$ is $\mathrm{MR}(X_\infty) = \sup_{n \ge 2} \mathrm{MR}(X_n)$, the smallest upper bound on the mesh-ratios of the $X_n$; a sequence with finite uniformity constant is called quasi-uniform. Quasi-uniform sequences of designs with small uniformity constants are the main sources of designs (point sets) in the meshless (or "mesh-free") methods of computational mathematics; see e.g. [6,18,23]. A sequence $X^*_\infty = \{x^*_1, x^*_2, \dots\}$ will be called MR-optimal if its uniformity constant is minimal: $\mathrm{MR}(X^*_\infty) = \min_{X_\infty \subset X} \mathrm{MR}(X_\infty)$. It is well known that when $X$ is connected, $\mathrm{MR}(X_n) \ge 1$ for any $n$-point design $X_n$ in $X$ (as the $n$ balls $B(x_i, h(X_n))$ must cover $X$). One of the main results of the paper is Theorem 1.1 below, which states that in fact $\limsup_{n\to\infty} \mathrm{MR}(X_n) \ge 2$ for any compact $X$ with positive volume. The proof is rather elementary, but the result does not seem to be known. It implies in particular that the classical greedy-packing algorithm is MR-optimal.
Theorem 1.1. For any sequence of nested designs $X_n$ in a compact set $X \subset \mathbb{R}^d$ with $\mathrm{vol}(X) > 0$, we have
$$ \limsup_{n\to\infty} \mathrm{MR}(X_n) \ge 2 \,. $$
In particular, $\mathrm{MR}(X_\infty) \ge 2$ for any $X_\infty \subset X$.
Theorem 1.1 is proved in Section 2. The greedy-packing (or coffee-house) algorithm is presented in Section 3.1; it constructs a sequence $X_\infty$ with $\mathrm{MR}(X_n) \le 2$ for all $n \ge 2$ and hence $\mathrm{MR}(X_\infty) = 2$. In Section 3.2, we generalize the greedy-packing algorithm to the construction of other quasi-uniform sequences with bounded $\mathrm{MR}(X_\infty)$. In Section 3.3 we use the results of Section 3.2 to establish properties of an implementable version of the greedy-packing algorithm where, at every iteration, the next design point $x_{n+1}$ is chosen among a finite set of candidates $X_N \subset X$ rather than within the whole of $X$. In Section 3.4 we consider a boundary-phobic version of greedy packing, which provides designs with worse (larger) mesh-ratio but better (smaller) fill distance. A connection with two greedy kernel-based constructions (energy minimization and the P-greedy algorithm) is presented in Section 3.5. Section 4 briefly concludes. The Matlab scripts used to produce Figures 1 and 2 are available at https://sdb3.i3s.unice.fr/anrindex/fr/node/5.

Proof of Theorem 1.1
Before providing a proof of Theorem 1.1, we prove two simple lemmas, both of independent interest.

Lemma 2.1. For any design $X_n$ in a compact set $X \subset \mathbb{R}^d$, we have
$$ h(X_n) \ge \left( \frac{\mathrm{vol}(X)}{n\, V_d} \right)^{1/d} . $$
Moreover, for any $m$ such that $n \ge m \ge 2$, we have
$$ q(X_n) \le \left( \frac{\mathrm{vol}(X_0)}{n\, V_d} \right)^{1/d} , $$
where $X_0 = X \oplus B(0, q(X_m))$, $X_m$ is a sub-design of $X_n$ consisting of $m$ points and $\oplus$ denotes the Minkowski sum.
Proof. The $n$ balls $B(x_i, h(X_n))$ cover $X$; this yields the first inequality. The second inequality follows from $q(X_n) \le q(X_m)$, which implies that all the balls $B(x_i, q(X_n))$, $i = 1, \dots, n$, whose interiors are pairwise disjoint, are fully inside $X_0$.

Lemma 2.1 has the following consequence concerning the rate of decrease of the fill distance and separation radius of quasi-uniform sequences of nested designs.
Corollary 2.1. For any quasi-uniform sequence of nested designs $X_n$ with uniformity constant $b$ in a compact set $X \subset \mathbb{R}^d$, we have
$$ c_1\, n^{-1/d} \le q(X_n) \qquad \text{and} \qquad h(X_n) \le c_2\, n^{-1/d} , $$
where $c_1$ and $c_2$ are some positive constants.
Proof. Since $h(X_n) \le b\, q(X_n)$ for all $n$, both bounds follow from Lemma 2.1.

Lemma 2.2. For any $n$-point design $X_n$ and any $(n+1)$-point design $X'_{n+1}$ in $X$, we have $q(X'_{n+1}) \le h(X_n)$.

Proof. Since the $n$ balls $B(x_i, h(X_n))$ cover $X$, the pigeon-hole principle implies that at least one of them must contain at least two points $x'_i$ and $x'_j$ from $X'_{n+1}$. Therefore, $q(X'_{n+1}) \le \|x'_i - x'_j\|/2 \le h(X_n)$.

Proof of Theorem 1.1. Assume that $\limsup_{n\to\infty} \mathrm{MR}(X_n) < 2$. This would yield the existence of $r < 2$ and $n_0$ such that $\mathrm{MR}(X_n) \le r$ for all $n \ge n_0$. Consider all such $n \ge n_0$. The definitions of $h(X_n)$ and $\mathrm{MR}(X_n)$ imply the existence of $x_j \in X_n$ such that $\|x_{n+1} - x_j\| \le h(X_n) = \mathrm{MR}(X_n)\, q(X_n) \le r\, q(X_n)$. Therefore, $q(X_{n+1}) \le (1/2) \|x_{n+1} - x_j\| \le (r/2)\, q(X_n)$, with $r/2 < 1$. This implies the exponential decrease of $q(X_n)$ to zero (as $n \to \infty$), which contradicts (2.1).

Greedy packing
Let us first describe the greedy-packing algorithm (called the "geometric greedy method" in [5]), which achieves the lower bound of Theorem 1.1 and hence constructs an MR-optimal sequence of points $X_\infty$ and nested designs $\{X_n\}_{n=1}^\infty$. This algorithm is sometimes called the "coffee-house" algorithm, by analogy with the behavior of customers in large coffee shops, where new clients tend to sit as far as possible from occupied tables [10].
Algorithm 1 (Greedy packing).
1: Choose an arbitrary $x_1 \in X$; set $n = 1$.
2: While more points are required:
3: choose $x_{n+1} \in \mathrm{Arg\,max}_{x \in X} \min_{x_i \in X_n} \|x - x_i\|$, set $X_{n+1} = X_n \cup \{x_{n+1}\}$ and increment $n$.

For an arbitrary $x_1 \in X$ and any choice of $x_{n+1} \in \mathrm{Arg\,max}_{x \in X} \min_{x_i \in X_n} \|x - x_i\|$ at step 3, the sequence of designs $X_n$ constructed by Algorithm 1 satisfies the following property.

Lemma 3.1. For all $n \ge 2$, $q(X_n) = h(X_{n-1})/2$.
Proof. The inequality $q(X_n) \ge h(X_{n-1})/2$ is proved in [5, Lemma 5.1] by induction on $n$; the equality is obtained by the same arguments, as shown below.
By the definition of $x_2$, we have $q(X_2) = h(X_1)/2$. Assume that $q(X_n) = h(X_{n-1})/2$ and consider $q(X_{n+1})$: since $\min_{x_i \in X_n} \|x_{n+1} - x_i\| = h(X_n) \le h(X_{n-1}) = 2\, q(X_n)$, we obtain $q(X_{n+1}) = \min\{ q(X_n),\, h(X_n)/2 \} = h(X_n)/2$.

Theorem 3.2. For all $n \ge 2$, the designs $X_n$ constructed by Algorithm 1 satisfy $q(X_n) \ge q^*_n/2$, $h(X_n) \le 2\, h^*_n$ and $\mathrm{MR}(X_n) \le 2$.

Proof. By Lemma 2.2 applied to the designs $X_{n+1}$ and $X^*_{n,\mathrm{FD}}$, we obtain $q(X_{n+1}) \le h^*_n$, so that $h(X_n) = 2\, q(X_{n+1}) \le 2\, h^*_n$ by Lemma 3.1. From Lemma 2.2 applied to the designs $X^*_{n+1,\mathrm{SR}}$ and $X_n$ and Lemma 3.1, we obtain $q^*_{n+1} \le h(X_n) = 2\, q(X_{n+1})$. Finally, Lemma 3.1 and the monotonicity of the fill distance give $\mathrm{MR}(X_n) = h(X_n)/q(X_n) \le h(X_{n-1})/q(X_n) = 2$.

Theorem 3.2 may be deduced from Theorem 2.2 in [7], where Algorithm 1 is used to minimize the maximum intercluster distance; see also [8, Theorem 4.3]. Theorem 3.2 also follows from Theorem 3.6 below. However, we think that the proof provided above is interesting in itself, as the important role of Lemma 3.1 uncovers the key property of Algorithm 1.
Note that in Theorem 3.2 the choice of the norm on $X$ is irrelevant. Moreover, $X$ does not have to be a subset of $\mathbb{R}^d$; in particular, $X$ can be a discrete set, as in the clustering problems considered in [7].
While the calculation of $q(X_n)$ is straightforward, $h(X_n)$ is difficult to compute when $X$ is a continuous set. Methods of computational geometry can sometimes be used [15], but are restricted to low-dimensional spaces. The substitution of a finite set $X_N$ for $X$, with the $N$ points of $X_N$ suitably well spread over $X$, is often used in practice; see Section 3.3 for the analysis of this version of Algorithm 1.
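A minimal sketch of this finite-candidate implementation (the function name is ours, not from the paper): since Theorem 3.2 also applies with $X$ replaced by the discrete candidate set, the mesh-ratio measured on that set is guaranteed to stay below 2.

```python
import math

def greedy_packing(candidates, x1, n_points):
    """Greedy (coffee-house) packing over a finite candidate set:
    each new point maximizes its distance to the current design."""
    design = [x1]
    # dist[k] = distance from candidates[k] to its nearest design point
    dist = [math.dist(c, x1) for c in candidates]
    while len(design) < n_points:
        k = max(range(len(candidates)), key=dist.__getitem__)  # farthest candidate
        x_new = candidates[k]
        design.append(x_new)
        dist = [min(d, math.dist(c, x_new)) for d, c in zip(dist, candidates)]
    return design

# One-dimensional illustration: X approximated by {0, 1/256, ..., 1}, started at 1/2
cands = [(i / 256,) for i in range(257)]
X16 = greedy_packing(cands, (0.5,), 16)
```

With $x_1 = 1/2$ the selected points follow the dyadic pattern of the van der Corput construction discussed below.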
For $d = 1$ and $X = [0, 1]$, Algorithm 1 initialized at $x_1 = 1/2$ is equivalent to the celebrated van der Corput sequence in base 2 in terms of the behaviour of $h(X_n)$, $q(X_n)$ and $\mathrm{MR}(X_n)$; see [13, p. 25]. The regular pattern of $\mathrm{MR}(X_n)$ observed in dimension 1 extends to dimension 2 with $X = [0, 1]^2$ when $\|\cdot\| = \|\cdot\|_2$ and the algorithm is initialized at the center $(1/2, 1/2)$. This is illustrated in the left panel of Figure 1: $\mathrm{MR}(X_n)$ takes two values only, $2$ and $\sqrt{2}$. The detailed behaviour of the algorithm is as follows.
Theorem 3.3 below characterizes, cycle by cycle, the packing and covering performance of Algorithm 1 in this setting. For the sake of brevity, we only give a sketch of the full proof. It is based on the self-replicating pattern of the construction. The first five points in $X = [0, 1]^2$ correspond to the corners and the center of the square. This gives the initialization for the beginning of the initial cycle, indexed by $m = 0$, with $m$ denoting the cycle number. Define the initialization of cycle $m$ as the replication of the initial design of cycle 0 into $4^m$ squares of side length $\gamma_m = 2^{-m}$, which form a regular partition of $[0, 1]^2$. The initial design for cycle $m$ has thus $n_m = (2^m + 1)^2 + 4^m = 2^{2m+1} + 2^{m+1} + 1$ points: $(2^m + 1)^2$ of them form a regular grid of width $\gamma_m$ (i.e., a $(2^m + 1)^2$ full factorial design); the other $4^m$ points are the centers of the small squares. When moving to the next cycle, the algorithm first (i) adds the midpoints of the sides of all small squares (in arbitrary order), then (ii) adds the $4^{m+1}$ centers of the smaller squares created during phase (i).

The case $d = 4$ (Theorem 3.5) is similar; the proof is omitted. As in the 2-dimensional case treated in Theorem 3.3, the construction follows a self-replicating pattern. The first 17 points in $X = [0, 1]^4$ are the 16 vertices and the center of $X$. This gives the initialization for the beginning of the cycle $m = 0$, which consists of the following two stages: (i) the algorithm chooses (in arbitrary order) all points with two coordinates equal to $1/2$ and the other two coordinates in $\{0, 1\}$; there are $2^2 \binom{4}{2} = 24$ such points; (ii) the algorithm chooses (in arbitrary order) points with one coordinate equal to $1/2$ and the other three in $\{0, 1\}$ (there are $2^3 \binom{4}{1} = 32$ such points), points with three coordinates equal to $1/2$ and one in $\{0, 1\}$ (there are $2 \binom{4}{3} = 8$ such points), and points with all coordinates in $\{1/4, 3/4\}$ (there are 16 such points).
The initialization of cycle $m$ is defined as the replication of the initial design of cycle 0 into $2^{4m}$ hypercubes of side length $\gamma_m = 2^{-m}$, which form a regular partition of $[0, 1]^4$. The initial design for cycle $m$ has thus $n_m = (2^m + 1)^4 + 2^{4m}$ points: $(2^m + 1)^4$ of them form a regular grid of width $\gamma_m$; the other $2^{4m}$ points are the centers of the small hypercubes. We thus have $2^{4m}$ replications of the initial 17-point design, but in smaller hypercubes. In each of them, the selections made by the algorithm are similar to those of cycle $m = 0$.
From the description above, we can observe that the design $X_{n_m}$ re-scaled by a factor $2^m$ gives the integer lattice $\mathbb{Z}^4$ truncated to $(i_1, i_2, i_3, i_4) \in \{0, \dots, 2^m\}^4$. Moreover, when $n$ equals $n_m$ plus the number of points added during stage (i), the design $X_n$ re-scaled by $2^{m+1}$ gives the so-called checkerboard lattice $D_4$ (the subset of the integer lattice $\mathbb{Z}^4$ consisting of quadruples whose sum is even), truncated to $(i_1, i_2, i_3, i_4) \in \{0, \dots, 2^{m+1}\}^4$; note that $D_4$ is the densest packing lattice in the 4-dimensional space [3, p. 9].
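The point counts in the cyclic description above can be checked mechanically (a small consistency check, not part of the proof):

```python
from math import comb

def n_cycle_2d(m):
    # initial design of cycle m in [0,1]^2: (2^m+1)^2 grid points plus 4^m centers
    return (2 ** m + 1) ** 2 + 4 ** m

def n_cycle_4d(m):
    # initial design of cycle m in [0,1]^4: (2^m+1)^4 grid points plus 2^(4m) centers
    return (2 ** m + 1) ** 4 + 2 ** (4 * m)

# points added during cycle 0 for d = 4, as enumerated in the text
stage_i = 2 ** 2 * comb(4, 2)                          # two coordinates equal to 1/2
stage_ii = 2 ** 3 * comb(4, 1) + 2 * comb(4, 3) + 16   # one or three halves, or {1/4,3/4}^4
```

In particular, $17 + 24 + 56 = 97 = (2^1+1)^4 + 2^4$, so cycle 0 indeed ends at the initialization of cycle 1.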
The regular behaviour of Algorithm 1 observed for $d = 1$, and for $d = 2$ and $4$, where the properly re-scaled design $X_n$ oscillates between the integer lattice and the checkerboard lattice, does not extend to other dimensions $d$.

Relaxed greedy packing
We now consider a generalization of Algorithm 1, where the next point at a given iteration is not necessarily the furthest away from the current design points, but is guaranteed to be far enough from them. The bounds obtained in Theorem 3.6 are worse than those in Theorem 3.2; however, it can be shown that the relaxation introduced may improve the covering properties of the design sequence generated; see Section 3.4.
Algorithm 2 (Relaxed greedy packing). Fix $a \in (0, 1]$; at step 3, choose any $x_{n+1} \in X$ such that $\min_{x_i \in X_n} \|x_{n+1} - x_i\| \ge \alpha_n\, h(X_n)$, where $\alpha_n \ge a$.

At step 3, the choice of $x_{n+1}$ is arbitrary provided it satisfies the condition indicated. Due to this flexibility, several existing algorithms form particular cases of Algorithm 2, which in fact defines a whole family of algorithms. In particular, one may first select $x^* \in \mathrm{Arg\,max}_{x \in X} \min_{x_i \in X_n} \|x - x_i\|$ and then take any point $x_{n+1} \in B(x^*, (1 - \alpha_n)\, h(X_n))$. Random designs with guaranteed covering and packing performance can easily be generated in this way: for instance, take $x_{n+1} = (1 - \alpha_n)\, x^*_n + \alpha_n\, x^*$, with $x^*_n$ the point of $X_n$ closest to $x^*$ and $\alpha_n$ a random variable (e.g., uniform) in $[a, 1]$, $a > 0$.
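A randomized sketch of Algorithm 2 over a finite candidate set (the sampling scheme below is just one of many admissible choices): any candidate at distance at least $\alpha_n\, h(X_n)$ from the current design is admissible, with $\alpha_n$ drawn uniformly in $[a, 1]$.

```python
import math
import random

def relaxed_greedy_packing(candidates, x1, n_points, a, rng):
    """Relaxed greedy packing (Algorithm 2 sketch): the next point is any
    candidate whose distance to the design is at least alpha_n * h(X_n),
    with alpha_n uniform in [a, 1]; X is approximated by the candidate set."""
    design = [x1]
    dist = [math.dist(c, x1) for c in candidates]
    while len(design) < n_points:
        h_n = max(dist)                      # fill distance over the candidate set
        alpha_n = rng.uniform(a, 1.0)
        admissible = [k for k, d in enumerate(dist) if d >= alpha_n * h_n]
        k = rng.choice(admissible)           # any admissible candidate will do
        x_new = candidates[k]
        design.append(x_new)
        dist = [min(d, math.dist(c, x_new)) for d, c in zip(dist, candidates)]
    return design

rng = random.Random(0)
cands = [(i / 32, j / 32) for i in range(33) for j in range(33)]
X20 = relaxed_greedy_packing(cands, (0.5, 0.5), 20, 0.5, rng)
```

With $X$ replaced by the candidate set, Theorem 3.6 guarantees $\mathrm{MR}(X_n) \le 2/a$ (here $2/0.5 = 4$) for the mesh-ratio measured on that set.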
Theorem 3.6. For all $n \ge 2$, the designs $X_n$ generated by any version of Algorithm 2 satisfy
$$ q(X_n) \ge \frac{a}{2}\, q^*_n\,, \qquad h(X_{n-1}) \le \frac{2}{a}\, h^*_{n-1} \qquad \text{and} \qquad \mathrm{MR}(X_n) \le \frac{2}{a}\,. \tag{3.1} $$

Proof. We first prove by induction that for all $n \ge 2$, $q(X_n) \ge (a/2)\, h(X_{n-1})$.

For $n = 2$, the condition at step 3 gives $q(X_2) = \|x_2 - x_1\|/2 \ge (\alpha_1/2)\, h(X_1) \ge (a/2)\, h(X_1)$. For the induction step, since $h(X_n) \le h(X_{n-1})$ and $\alpha_n \ge a$,
$$ q(X_{n+1}) = \min\Big\{ q(X_n),\, \tfrac12 \min_{x_i \in X_n} \|x_{n+1} - x_i\| \Big\} \ge \min\Big\{ \tfrac{a}{2}\, h(X_{n-1}),\, \tfrac{a}{2}\, h(X_n) \Big\} = \tfrac{a}{2}\, h(X_n) \,. $$

The inequality proved by induction implies $\mathrm{MR}(X_n) = h(X_n)/q(X_n) \le h(X_{n-1})/q(X_n) \le 2/a$.
Next, by Lemma 2.2, $h(X_{n-1}) \ge q^*_n$, and therefore $q(X_n) \ge (a/2)\, h(X_{n-1}) \ge (a/2)\, q^*_n$. The same lemma implies $h^*_{n-1} \ge q(X_n) \ge (a/2)\, h(X_{n-1})$.

Theorem 3.2 follows from Theorem 3.6 by taking $a = 1$. As in Theorem 3.2, the choice of the norm on $X$ is irrelevant and $X$ does not have to be a subset of $\mathbb{R}^d$.
The following property is an extension of Theorem 3.6 to the situation where $\liminf_{n\to\infty} \alpha_n = a \in (0, 1]$ in Algorithm 2 (i.e., not all $\alpha_i$ are bounded from below by $a$).

Theorem 3.7. Suppose that in Algorithm 2 the choice of $x_{n+1}$ at step 3 is such that the scalars $\alpha_n$ satisfy $\liminf_{n\to\infty} \alpha_n = a \in (0, 1]$. Then the bounds of Theorem 3.6 hold asymptotically:
$$ \liminf_{n\to\infty} \frac{q(X_n)}{q^*_n} \ge \frac{a}{2} \qquad \text{and} \qquad \limsup_{n\to\infty} \mathrm{MR}(X_n) \le \frac{2}{a}\,. \tag{3.4} $$

Proof. As $\liminf_{n\to\infty} \alpha_n = a \in (0, 1]$, for all $\varepsilon > 0$ there exists $n_0$ such that $\alpha_n > a - \varepsilon$ for all $n \ge n_0$. We first prove that $\min_{x_i \in X_n} \|x_{n+1} - x_i\| < 2\, q(X_n)$ for infinitely many $n$: otherwise we would have $q(X_{n+1}) = q(X_n)$ for all $n \ge n_0$. As $q(X_{n_0}) > 0$, this is in contradiction with Lemma 2.1, which states that $q(X_n) \le C\, n^{-1/d}$ for some $C > 0$.
In the next section, Theorem 3.6 is used for assessing properties of an easily implementable version of Algorithm 1, where x n+1 at step 3 is chosen from a finite set.

Greedy packing for a finite candidate set
Consider a version of Algorithm 1 where $x_{n+1}$ is chosen among a finite set of candidates $X_N \subset X$ rather than from the whole of $X$. This makes the implementation of Algorithm 1 much simpler but naturally deteriorates its performance. Such an implementation of Algorithm 1 can be considered as a special case of Algorithm 2, and hence, as we show below in Theorem 3.9, its performance over the entire $X$ can be assessed. Note that the total number of iterations must be smaller than $N$, the number of candidate points: indeed, for $n \ge N$, the algorithm degenerates, as several points necessarily coincide in $X_{N+j}$, $j \ge 1$. For a design $X_n$, denote by $h_{X_N}(X_n) = \max_{x \in X_N} \min_{x_i \in X_n} \|x - x_i\|$ its fill distance for the finite set $X_N$, and by $h_X(X_n)$ its fill distance for $X$.

Lemma 3.8. For any $n$-point design $X_n$ and any $N$-point set $X_N \subset X$ we have
$$ h_X(X_n) \le h_{X_N}(X_n) + h_X(X_N) \,. $$

Theorem 3.9. When Algorithm 1 uses a finite set of candidates $X_N \subset X$ and $n < N$, its performance satisfies the inequalities (3.1) with $a$ replaced by $\alpha_{n-1}$, where $\alpha_n = h_{X_N}(X_n)/h_X(X_n)$.

Proof. At step 3 of Algorithm 1, we have $\min_{x_i \in X_n} \|x_{n+1} - x_i\| = h_{X_N}(X_n) = \alpha_n\, h_X(X_n)$. Since $h_{X_N}(X_n)$ is non-increasing with $n$, $\alpha_n$ is non-increasing too (it reaches zero when $X_n$ has exhausted $X_N$, that is, when $n = N$). Theorem 3.6 with $\alpha_n$ substituted for $a$ implies (3.1).
As we do not know $h_X(X_n)$, and thus $\alpha_n$, we can use the inequality of Lemma 3.8, which gives $\alpha_n \ge a_n = h_{X_N}(X_n)/\big( h_{X_N}(X_n) + h_X(X_N) \big)$. The inequalities (3.1) then remain true with $a_n$ substituted for $\alpha_n$, as long as $a_n > 0$.
A result similar to Theorem 3.9 holds when the performance of Algorithm 1 is evaluated on a finite set $X'_N \supset X_N$ instead of $X$: we simply substitute $X'_N$ for $X$ in the results above.

Boundary-phobic greedy packing
Versions of the greedy-packing algorithm that enforce boundary avoidance have been proposed in [14,19]. There, at iteration $n \ge 2$, the next point $x_{n+1}$ is chosen in $\mathrm{Arg\,max}_{x \in X} D_\beta(x, X_n, X)$, where
$$ D_\beta(x, X_n, X) = \min\Big\{ \min_{x_i \in X_n} \|x - x_i\|\,,\ \beta\, d(x, \partial X) \Big\} \,, $$
with $d(x, \partial X)$ the distance from $x$ to the boundary of $X$. Note that this quantity is easily determined if $X$ has a simple shape, like a hypercube or a ball, but may be difficult to evaluate otherwise. For $\beta = \infty$, we define $D_\infty(x, X_n, X) = \min_{x_i \in X_n} \|x - x_i\|$ by continuity; the algorithm then coincides with Algorithm 1. For $\beta = 1$, $x_{n+1}$ is the center of (one of) the largest ball(s) included in $X$ and not intersecting $X_n$. For $\beta = 2$, the algorithm corresponds to a greedy method for the solution of the traditional packing problem, for which the $n$ balls do not intersect and are constrained to be fully inside $X$. For $\beta > 2$, the larger $\beta$ is, the more the balls are allowed to overshoot $X$, with their centers remaining inside $X$. When $X = [0, 1]^d$ and $\|\cdot\| = \|\cdot\|_2$, the value $\beta = 2\sqrt{2d}$ is recommended in [19], while [14] recommends to let $\beta$ depend on the targeted number $n_{\max}$ of design points.

Theorem 3.10. Let $X = [0, 1]^d$, $\|\cdot\| = \|\cdot\|_2$ and $\beta \in (0, \infty)$. Then the boundary-phobic algorithm above is a particular instance of Algorithm 2 with $\alpha_n = a = 1/(1 + \sqrt{d}/\beta)$.

Proof. Let $r_n = D_\beta(x_{n+1}, X_n, X) = \max_{x \in X} D_\beta(x, X_n, X)$. Any $x \in X$ satisfies at least one of the two inequalities $\min_{x_i \in X_n} \|x - x_i\| \le r_n$ and $d(x, \partial X) \le r_n/\beta$. In the second case, there exists $x' \in X$ (obtained by pushing the coordinates of $x$ away from $\partial X$) with $\|x - x'\| \le \sqrt{d}\, r_n/\beta$ and $d(x', \partial X) \ge r_n/\beta$, so that $\min_{x_i \in X_n} \|x' - x_i\| \le r_n$ and hence $\min_{x_i \in X_n} \|x - x_i\| \le (1 + \sqrt{d}/\beta)\, r_n$. Therefore, $h(X_n) \le (1 + \sqrt{d}/\beta)\, r_n$. Since, by definition, $r_n \le \min_{x_i \in X_n} \|x_{n+1} - x_i\|$, we have $\min_{x_i \in X_n} \|x_{n+1} - x_i\| \ge h(X_n)/(1 + \sqrt{d}/\beta)$, and the algorithm is a particular instance of Algorithm 2 with $\alpha_n = a = 1/(1 + \sqrt{d}/\beta)$.
Theorem 3.10 implies that the performance of this algorithm satisfies the bounds indicated in Theorem 3.6. Comparison of the right panel of Figure 2 with that of Figure 1 shows that boundary avoidance has significantly reduced $h(X_n)$. This reduction is obtained at the detriment of $\mathrm{MR}(X_n)$ for some $X_n$, as illustrated by the left panels of the two figures (note, however, that $\mathrm{MR}(X_{80}) < 2$ in Figure 2).
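A sketch of the boundary-phobic selection on the unit square, using the $D_\beta$ criterion above over a finite candidate grid (the helper names are ours): since $D_\beta$ vanishes on $\partial X$, no selected point can lie on the boundary.

```python
import math

def boundary_phobic_greedy(candidates, x1, n_points, beta):
    """Boundary-phobic greedy packing on X = [0,1]^d (sketch): the next
    point maximizes D_beta(x) = min(min_i ||x - x_i||, beta * d(x, bd X))."""
    def dist_to_boundary(x):
        # distance from x to the boundary of the unit cube
        return min(min(xj, 1.0 - xj) for xj in x)
    design = [x1]
    dist = [math.dist(c, x1) for c in candidates]
    while len(design) < n_points:
        k = max(range(len(candidates)),
                key=lambda j: min(dist[j], beta * dist_to_boundary(candidates[j])))
        x_new = candidates[k]
        design.append(x_new)
        dist = [min(d, math.dist(c, x_new)) for d, c in zip(dist, candidates)]
    return design

cands = [(i / 32, j / 32) for i in range(33) for j in range(33)]
X20 = boundary_phobic_greedy(cands, (0.5, 0.5), 20, 4.0)
```

With $\beta = 4$ this reproduces, on a coarse grid, the setting of Example 3.11.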
Connection with kernel-based greedy constructions

Greedy energy minimization. For $s > 0$, consider the Riesz kernel $K_s(x, x') = \|x - x'\|^{-s}$ and the (normalized) energy
$$ E_{K_s}(X_n) = \frac{1}{n(n-1)} \sum_{i \neq j} K_s(x_i, x_j) $$
associated with an $n$-point design $X_n = \{x_1, \dots, x_n\}$. A design $X^*_n \subset X$ that minimizes $E_{K_s}(X_n)$ is called a set of $s$-Fekete points (the original denomination is for $X$ being the sphere $S^2$); see, e.g., [2, Chap. 2] for a thorough exposition of the discrete energy problem and [2, Chap. 4] for the connection with the continuous energy problem. In particular, [2, Prop. 2.1.1] shows that $E_{K_s}(X^*_n)$ is non-decreasing in $n$; its limit is called the transfinite diameter of $X$ and coincides with the Wiener constant, i.e., the minimum value of the continuous energy on $X$ [2, Th. 4.2.2]. The next theorem shows the strong connection that exists between greedy energy minimization for the Riesz kernel and greedy packing.
Theorem 3.12. The designs $X_n$ obtained by greedy minimization of the energy $E_{K_s}$, i.e., such that $x_{n+1} \in \mathrm{Arg\,min}_{x \in X} \sum_{i=1}^{n} K_s(x, x_i)$, satisfy
$$ \min_{x_i \in X_n} \|x_{n+1} - x_i\| \ge n^{-1/s}\, h(X_n) \,. $$
Greedy energy minimization (with the Riesz kernel $K_s$) thus corresponds to a particular version of relaxed greedy packing (Algorithm 2), where $\alpha_n = n^{-1/s}$ decreases with $n$, and Theorem 3.6 implies (3.3).
The covering and packing efficiencies of $X_n$ may degrade as $n$ increases, but if we let $s = s_n$ vary at each iteration in $K_s$, in such a way that $n^{1/s_n} \to 1$ (or, equivalently, $s_n/\log n \to \infty$) as $n \to \infty$, then Theorem 3.7 implies (3.4).
Similar developments with the isotropic Matérn 1/2 kernel with correlation length $\ell$, $K_{1/2,\ell}(x, x') = \exp(-\|x - x'\|/\ell)$, show that $\min_i \|x_{n+1} - x_i\| \ge h(X_n) - \ell \log n$ for greedy energy minimization. Since $h(X_n) \ge C\, n^{-1/d}$ for some $C > 0$ from Lemma 2.1, by letting $\ell = \ell_n$ decrease with $n$ in such a way that $n^{1/d}\, \ell_n \log n \to 0$ as $n \to \infty$, we also get (3.4) from Theorem 3.7.
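A sketch of greedy Riesz-energy minimization over a finite candidate set (the incremental form, minimizing $\sum_i K_s(x, x_i)$, is our reading of the construction): for large $s$ the selection mimics greedy packing, as Theorem 3.12 suggests.

```python
import math

def greedy_riesz_energy(candidates, x1, n_points, s):
    """Greedy energy minimization for the Riesz kernel ||x - x'||^(-s):
    the next point minimizes the energy increment sum_i ||x - x_i||^(-s).
    Candidates already in the design carry infinite energy, so they are
    never selected again."""
    design = [x1]
    energy = [math.inf if c == x1 else math.dist(c, x1) ** (-s) for c in candidates]
    while len(design) < n_points:
        k = min(range(len(candidates)), key=energy.__getitem__)
        x_new = candidates[k]
        design.append(x_new)
        energy = [math.inf if c == x_new else e + math.dist(c, x_new) ** (-s)
                  for e, c in zip(energy, candidates)]
    return design

# With s = 50 on [0,1], the first points after x1 = 1/2 are the endpoints,
# exactly as in greedy packing.
cands = [(i / 16,) for i in range(17)]
X3 = greedy_riesz_energy(cands, (0.5,), 3, 50)
```

Unlike Algorithm 1, the selected point is generally unique, since the energy accounts for the location of all previous points.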
An advantage of this type of construction over Algorithm 1 is that the choice of $x_{n+1}$ accounts for the location of all previous $x_i$, $i \le n$, and hence is generally uniquely defined, whereas there are often several equivalent choices at step 3 of Algorithm 1 (see for example the regular patterns explained in Theorems 3.3 and 3.5).

The P-greedy algorithm. The P-greedy algorithm selects $x_{n+1} \in \mathrm{Arg\,max}_{x \in X} P_n(x)$, with $P_n$ the power function associated with a kernel of smoothness $\tau$; the fill distance $h(X_n)$ then decreases at the rate $n^{-1/d}$ (which is optimal from Lemma 2.1). The same result shows that any relaxation of the P-greedy algorithm that selects an arbitrary $x_{n+1}$ in the set $\{x \in X : P_n(x) \ge \gamma \max_{x \in X} P_n(x)\}$ ($\gamma \in (0, 1]$) also achieves the rate of decrease $n^{-1/d}$ for $h(X_n)$, and Theorem 19 in the same paper shows that $q(X_n) > c\, n^{-1/d}$ for some $c > 0$ (so that the rate of decrease of $q(X_n)$ is optimal too) when $\tau > d/2 + 1$ and $X$ satisfies an interior cone condition and has a Lipschitz boundary. As for energy minimization (Section 3.5.1), an advantage of this type of construction over Algorithm 1 is that the choice of $x_{n+1}$ accounts for the location of all previous $x_i$.

Conclusion
By introducing a relaxation in the classical greedy-packing algorithm, we have proposed a general class of simple greedy algorithms that generate quasi-uniform sequences of nested designs with guaranteed packing and covering performance.We have shown that the value 2 of the uniformity constant of the greedy-packing algorithm is optimal, and that it can be attained by a relaxed algorithm whose relaxation vanishes asymptotically.A connection with two kernel-based greedy constructions has been evidenced.

Example 3.11. We take $X = [0, 1]^2$, $\|\cdot\| = \|\cdot\|_2$ and $\beta = 4$. The left panel of Figure 2 shows the evolution of $\mathrm{MR}(X_n)$ as a function of $n = 2, \dots, 80$ when $X_n$ is generated by $x_{n+1} \in \mathrm{Arg\,max}_{x \in X} D_\beta(x, X_n, X)$ with $x_1 = (1/2, 1/2)$; the upper bound $2(1 + \sqrt{d}/\beta)$ on $\mathrm{MR}(X_n)$ is indicated by a horizontal line. The right panel presents $X_{80}$, which can be compared with the right panel of Figure 1.

For $x^* \in \mathrm{Arg\,max}_{x \in X} \min_i \|x - x_i\|$,
$$ \min_i \|x_{n+1} - x_i\| \ge n^{-1/s} \min_i \|x^* - x_i\| = n^{-1/s}\, h(X_n) \,. $$