Abstract
Optimal design for linear regression is a fundamental task in statistics. For finite design spaces, recent progress has shown that random designs drawn using proportional volume sampling (PVS for short) lead to polynomial-time algorithms with approximation guarantees that outperform i.i.d. sampling. PVS strikes a balance between design nodes that jointly fill the design space and nodes that marginally concentrate in regions of high mass under the solution of a relaxed convex version of the original problem. In this paper, we examine some of the statistical implications of a new variant of PVS for (possibly Bayesian) optimal design. Using point process machinery, we treat the case of a generic Polish design space. We show that not only are known A-optimality approximation guarantees preserved, but we obtain similar guarantees for D-optimal design that tighten recent results. Moreover, we show that our PVS variant can be sampled in polynomial time. Unfortunately, in spite of its elegance and tractability, we demonstrate on a simple example that the practical implications of general PVS are likely limited. In the second part of the paper, we focus on applications and investigate the use of PVS as a subroutine for stochastic search heuristics. We demonstrate that PVS is a robust addition to the practitioner’s toolbox, especially when the regression functions are nonstandard and the design space, while low-dimensional, has a complicated shape (e.g., nonlinear boundaries, several connected components).
Notes
Python code available at https://github.com/APoinas/Optimal-design-in-continuous-space
References
Andersen, M., Dahl, J., Liu, Z., Vandenberghe, L.: Interior-point methods for large-scale cone programming. In: Sra, S., Nowozin, S., Wright, S. (eds.) Optimization for Machine Learning, MIT Press, chap 1, pp. 55–83 (2012)
Atkinson, A., Donev, A., Tobias, R.: Optimum Experimental Designs, with SAS. Oxford University Press, Oxford Statistical Science Series, Oxford (2007)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, USA (2004)
Collings, B.J.: Characteristic polynomials by diagonal expansion. Am. Stat. 37(3), 233–235 (1983)
Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes, vol. I, 2nd edn. Springer, New York (2003)
De Castro, Y., Gamboa, F., Henrion, D., Hess, R., Lasserre, J.: Approximate optimal designs for multivariate polynomial regression. Ann. Stat. 47(1), 127–155 (2019)
Dereziński, M., Warmuth, M., Hsu, D.: Leveraged volume sampling for linear regression. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems, pp. 2510–2519 (2018)
Dereziński, M., Warmuth, M., Hsu, D.: Unbiased estimators for random design regression. arXiv preprint (2019)
Dereziński, M., Liang, F., Mahoney, M.: Bayesian experimental design using regularized determinantal point processes. In: Chiappa, S., Calandra, R. (Eds.) Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR, Online, Proceedings of Machine Learning Research, Vol. 108, pp. 3197–3207 (2020)
Dereziński, M., Mahoney, M.: Determinantal point processes in randomized numerical linear algebra. Not. Am. Math. Soc. 68, 1 (2021)
Dette, H.: Bayesian D-optimal and model robust designs in linear regression models. Stat.: J. Theor. Appl. Stat. 25(1), 27–46 (1993)
Dette, H., Studden, W.J.: The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis. Wiley Series in Probability and Statistics, Wiley, Hoboken (1997)
Dette, H., Melas, V., Pepelyshev, A.: D-optimal designs for trigonometric regression models on a partial circle. Ann. Inst. Stat. Math. 54, 945–959 (2002)
Dick, J., Pillichshammer, F.: Digital Nets and Sequences. Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
Fang, K., Li, R., Sudjianto, A.: Design and modeling for computer experiments. Computer science and data analysis series, 1st edn. Chapman and Hall/CRC, Boca Raton (2006)
Farrell, R.H., Kiefer, J., Walbran, A.: Optimum multivariate designs. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1: Statistics, University of California Press, pp. 113–138 (1967)
Fedorov, V.: Theory of Optimal Experiments Designs. Academic Press, New York (1972)
Gautier, G., Bardenet, R., Valko, M.: On two ways to use determinantal point processes for Monte Carlo integration. Tech. rep., ICML workshop on Negative dependence in machine learning (2019a)
Gautier, G., Polito, G., Bardenet, R., Valko, M.: DPPy: DPP Sampling with Python. Journal of Machine Learning Research - Machine Learning Open Source Software (JMLR-MLOSS) (2019b)
Grove, D., Woods, D., Lewis, S.: Multifactor B-spline mixed models in designed experiments for the engine mapping problem. J. Qual. Technol. 36(4), 380–391 (2004)
Hough, J., Krishnapur, M., Peres, Y., Virág, B.: Zeros of Gaussian Analytic Functions and Determinantal Point Processes. American Mathematical Society, Providence (2009)
Hough, J.B., Krishnapur, M., Peres, Y., Virág, B.: Determinantal processes and independence. Probab. Surv. 3, 206–229 (2006)
Johansson, K.: Random matrices and determinantal processes. Les Houches Summer School Proc. 83(C), 1–56 (2006)
Kulesza, A., Taskar, B.: Determinantal point processes for machine learning. Foundations and Trends in Machine Learning (2012)
Lavancier, F., Møller, J., Rubak, E.: Determinantal point process models and statistical inference. J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 77, 853–877 (2015)
Liski, E., Mandal, N., Shah, K., Sinha, B.K.: Topics in Optimal Design, 1st edn. Lecture Notes in Statistics 163, Springer, New York (2002)
Liu, X., Yue, R.X., Chatterjee, K.: Geometric characterization of D-optimal designs for random coefficient regression models. Statist. Probab. Lett. 159, 108696 (2020)
Macchi, O.: The coincidence approach to stochastic point processes. Adv. Appl. Probab. 7, 83–122 (1975)
Maronge, J., Zhai, Y., Wiens, D., Fang, Z.: Optimal designs for spline wavelet regression models. J. Stat. Plann. Inference 184, 94–104 (2017)
Nikolov, A., Singh, M., Tantipongpipat, U.T.: Proportional volume sampling and approximation algorithms for A-optimal design. In: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, SODA ’19, pp. 1369–1386 (2019)
Piepel, G., Stanfill, B., Cooley, S., Jones, B., Kroll, J., Vienna, J.: Developing a space-filling mixture experiment design when the components are subject to linear and nonlinear constraints. Qual. Eng. 31(3), 463–472 (2019). https://doi.org/10.1080/08982112.2018.1517887
Pronzato, L., Pázman, A.: Design of Experiments in Nonlinear Models: Asymptotic Normality, Optimality Criteria and Small-Sample Properties. Lecture Notes in Statistics, vol. 212. Springer-Verlag, New York (2013)
Pukelsheim, F.: Optimal Design of Experiments. Classics in applied mathematics 50, Society for Industrial and Applied Mathematics (2006)
Pukelsheim, F., Rieder, S.: Efficient rounding of approximate designs. Biometrika 79(4), 763–770 (1992)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (2004)
Di Summa, M., Eisenbrand, F., Faenza, Y., Moldenhauer, C.: On largest volume simplices and sub-determinants. In: Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’15) (2015)
Virtanen, P., Gommers, R., Oliphant, T., et al.: SciPy 1.0: fundamental algorithms for scientific computing in python. Nat. Methods 17, 261–272 (2020)
Woods, D., Lewis, S., Dewynne, J.: Designing experiments for multi-variable B-spline models. Sankhya 65, 660–670 (2003)
Acknowledgements
We thank Adrien Hardy for useful discussions throughout the project. We thank Michał Dereziński for his insightful comments and suggestions on an early draft. We acknowledge support from ERC grant Blackjack (ERC-2019-STG-851866) and ANR AI chair Baccarat (ANR-20-CHIA-0002).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
A Proof of the well-definedness of Definition 3.1
The Janossy densities are clearly nonnegative. Therefore, in order to prove that proportional volume sampling is well defined (see Daley and Vere-Jones 2003, Proposition 5.3.II.(ii)), we only need to show that
We write the eigenvalues of \(\varLambda \) as \(\lambda _1\le \cdots \le \lambda _p\) and the spectral decomposition of \(\varLambda \) as \(\varLambda =P^T D_\lambda P\), where \(D_\lambda \) is the \(p\times p\) diagonal matrix with the \(\lambda _i\) as its diagonal entries. We then define the functions \(\psi _i\), \(1\le i\le p\), as the linear transforms of the \(\phi _i\) given by \((\psi _1(x),\cdots ,\psi _p(x)):=(\phi _1(x),\cdots ,\phi _p(x))P^T\). Finally, we have the decomposition
where \(\psi _S:=(\psi _{S_1},\cdots ,\psi _{S_{|S|}})\) and \(\lambda ^{S^c}:=\prod _{i\notin S}\lambda _i\), with the usual convention \(\lambda ^{\emptyset }=1\); see Collings (1983). Now, by the discrete Cauchy–Binet formula,
where \(x_T:=(x_{T_1},\cdots ,x_{T_{|T|}})\). And, by using the more general Cauchy-Binet formula (Johansson 2006), we get
Therefore
where, in the last two identities, we used the facts that (i) \(G_\nu (\psi _S)\) is equal to \(G_\nu (\psi )_S\), the submatrix of \(G_\nu (\psi )\) whose rows and columns are indexed by S, and (ii) \(G_\nu (\psi )=G_\nu (\phi P^T)=PG_\nu (\phi )P^T\). This proves (A.1).
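In its simplest form, the discrete Cauchy–Binet identity used above states that, for an \(n\times p\) matrix M with \(n\ge p\), \(\det (M^TM)\) equals the sum of the squared \(p\times p\) minors of M over all row subsets of size p. The following Python sketch (a toy check only, with a random matrix standing in for the feature matrix) verifies this by brute-force enumeration:

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3
M = rng.standard_normal((n, p))  # n x p matrix, rows playing the role of phi(x_i)

# Left-hand side: determinant of the p x p Gram matrix M^T M.
lhs = np.linalg.det(M.T @ M)

# Right-hand side: sum of squared p x p minors over all size-p row subsets T.
rhs = sum(
    np.linalg.det(M[list(T), :]) ** 2
    for T in itertools.combinations(range(n), p)
)

assert np.isclose(lhs, rhs)
```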
B Proof of Proposition 3.2
First, we write
Since \((\phi (x)^T\phi (x)+\varLambda )^{-1}\det (\phi (x)^T\phi (x)+\varLambda )\) is the adjugate matrix of \((\phi (x)^T\phi (x)+\varLambda )\), its (i, j) entry is
where we define \(\varLambda _{-j,-i}\) as the matrix \(\varLambda \) with its jth row and ith column removed, and \(\phi _{-i}(x)\) as the vector \(\phi (x)\) with its ith entry removed. Therefore, the (i, j) entry of the matrix \(\mathbb {E}\left[ (\phi (X)^T\phi (X)+\varLambda )^{-1}\right] \) is
Using the same reasoning as in the proof of normalization in Sect. A, we get that
Note that the proof in Sect. A does not rely on any symmetry argument, so that identity (B.1) can be proved in the same way. As a consequence, we get that
which is the (i, j) entry of the inverse matrix of \(G_\nu (\phi )+\varLambda \). This proves identity (3.2).
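As a sanity check, the adjugate identity invoked at the start of this proof, namely that the (i, j) entry of \(\det (M)M^{-1}\) is \((-1)^{i+j}\) times the determinant of M with its jth row and ith column removed, can be verified numerically. In this Python sketch, an arbitrary positive definite matrix stands in for \(\phi (x)^T\phi (x)+\varLambda \):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
A = rng.standard_normal((p, p))
M = A @ A.T + np.eye(p)  # positive definite stand-in for phi(x)^T phi(x) + Lambda

# Adjugate of M, defined as det(M) * M^{-1} for invertible M.
adj = np.linalg.det(M) * np.linalg.inv(M)

# Entry (i, j) of the adjugate equals (-1)^{i+j} times the determinant
# of M with its j-th row and i-th column removed (the (j, i) cofactor).
for i in range(p):
    for j in range(p):
        minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
        assert np.isclose(adj[i, j], (-1) ** (i + j) * np.linalg.det(minor))
```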
Finally, the proof of identity (3.3) is straightforward:
C Proof of Proposition 3.3
By definition of the Janossy densities, we have
The integral in the numerator simplifies to
As for the denominator of (C.1), following the lines of Sect. A leads to
where the \(\psi \) functions are defined the same way as in Sect. A. Recalling that
we can rewrite the sum in (C.3) as
Now, since \(\nu (\varOmega )=k\), the sequence \(i\mapsto \nu (\varOmega )^i/i!\) is increasing when \(i\le k\). Hence, for all \(S\subset [p]\) such that \(S\ne [p]\),
We thus obtain
Finally, combining (C.1), (C.2), (C.4) and the fact that \(\nu (\varOmega )=k\), we get
concluding the proof.
D Proof of Proposition 3.4
Using the convexity of \(x\mapsto 1/x\) on \(\mathbb {R}_+^*\), we obtain
Now, in Sect. C we showed that
which concludes the proof.
E Proof of Proposition 3.5
By definition of the Janossy densities, we have
Using the same notation as in Sect. A, we expand the numerator into
Now,
where in this case \(S^c\) denotes the complement of S relative to \([p]\backslash \{i\}\). Note that there are exactly \(m_0=\dim (\text {Ker}(\varLambda ))\) eigenvalues of \(\varLambda \) equal to 0, so that the terms of the sum in (E.3) vanish when \(|S|\le m_0-1\). Since \(\nu (\varOmega )=k\), the sequence \(i\mapsto \nu (\varOmega )^i/i!\) is increasing when \(i\le k\), so that
which, combined with (E.2), gives
Finally, combining (E.1), (E.4) and (C.3) gives
and since \(\nu (\varOmega )=k\), this concludes the proof.
F Proof of Proposition 3.7
The Janossy densities and correlation functions of a point process are linked by the following identity; see Daley and Vere-Jones (2003, Lemma 5.4.III):
Applying this identity to the Janossy densities of \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\), we get
Now, for all \(x_1,\cdots ,x_n\in \varOmega \), using the same reasoning as in the proof of normalization in Sect. A but replacing the matrix \(\varLambda \) with the matrix \(\phi (x)^T\phi (x)+\varLambda \), we get
We then conclude that
G Proof of Proposition 3.8
For any \(n\in \mathbb {N}\) and \(x\in \varOmega ^n\), we write K[x] for the \(n\times n\) matrix with entries \(K(x_i,x_j)\). Since \(G_\nu (\phi )+\varLambda \) is invertible,
Now, to conclude that the superposition of X and Y is distributed as \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,A)\), it remains to show that it has the same correlation functions.
Let \(n\in \mathbb {N}\); we recall that the nth-order correlation function \(\rho '_n\) of \(X\cup Y\) satisfies
for all integrable functions f, where the \(\ne \) symbol means that the sum is taken over distinct elements of \(X\cup Y\). Since each element of \(X\cup Y\) is either in X or in Y, (G.2) can be rewritten as
This proves that the correlation functions of \(X\cup Y\) also satisfy
Therefore, \(X\cup Y\) is distributed as \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,A)\).
H Proof of Corollary 3.10
\(X\sim \mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\) is the superposition of a Poisson point process Y with intensity \(\nu \) and a DPP Z with intensity measure \(\rho (x)\textrm{d}x\), where \(\rho (x)=\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\); see identity (G.1). Therefore,
with \(\mathbb {E}[|Y|]=\nu (\varOmega )\) and
Since we can rewrite \(\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\) as \(\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\phi (x))\), we get
concluding the proof.
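The trace rewriting used above holds for any measure \(\nu \) with Gram matrix \(G_\nu (\phi )\): integrating the quadratic form \(\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\) against \(\nu \) yields \(\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1}G_\nu (\phi ))\). This can be illustrated on a discrete toy measure; in the Python sketch below, the weights and feature values are arbitrary stand-ins, not quantities from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
p, m = 3, 50  # p regression functions, m support points for a discrete nu

weights = rng.uniform(0.1, 1.0, size=m)  # nu({x_i}) = weights[i] (arbitrary)
Phi = rng.standard_normal((m, p))        # row i stands in for phi(x_i)
Lam = np.eye(p)                          # a positive definite Lambda

# Gram matrix G_nu(phi) = sum_i weights[i] * phi(x_i)^T phi(x_i).
G = Phi.T @ (weights[:, None] * Phi)
Minv = np.linalg.inv(G + Lam)

# Integral of the quadratic form phi(x) (G + Lambda)^{-1} phi(x)^T against nu...
integral = sum(w * (f @ Minv @ f) for w, f in zip(weights, Phi))
# ...equals the trace form used in the proof of Corollary 3.10.
trace_form = np.trace(Minv @ G)

assert np.isclose(integral, trace_form)
```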
I A parametrized reference measure for Sect. 3.4.
To parametrize \(\nu \), we write its density f as a linear combination of positive functions with nonnegative weights, that is,
This way, minimizing \(h(G_\nu (\phi ))\) over \(\nu =f\textrm{d}x\) of the form (1.1) and such that \(\nu (\varOmega )=k\) is equivalent to finding \((\omega _1,\cdots ,\omega _n)\) minimizing
This is now a convex optimization problem that can be solved numerically. For our illustration, we take \(h\in \{h_D,h_A\}\) and the \(g_i\) to be the 231 polynomial functions of two variables with degree \(\le 10\), as well as their compositions with \((x,y)\mapsto (1-x,1-y)\), which are all nonnegative functions on \(\varOmega =[0,1]^2\). We show in Fig. 7 the density of the measures minimizing (1.2) for both optimality criteria and for \(\varLambda \in \{I_{10}, 0.01 I_{10}, 0.0001 I_{10}\}\).
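A weight optimization of this kind can be set up directly in Python. The sketch below is only illustrative and makes several assumptions not taken from the paper: a hypothetical one-dimensional stand-in for \(\varOmega \), a small monomial basis for the \(g_i\), grid quadrature for the integrals, and SciPy's SLSQP solver in place of a dedicated cone programming method:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical 1-D stand-in for Omega = [0, 1], with a small monomial basis.
p, n_basis, m, k = 3, 5, 200, 4.0
x = np.linspace(0.0, 1.0, m)
dx = 1.0 / m
Phi = np.vander(x, p, increasing=True)       # phi(x) = (1, x, x^2)
g = np.vander(x, n_basis, increasing=True)   # g_i(x) = x^i, nonnegative on [0, 1]

masses = g.sum(axis=0) * dx                              # integral of each g_i
G_parts = np.einsum('xi,xj,xb->bij', Phi, Phi, g) * dx   # int g_b phi^T phi dx

def h_A(w):
    """A-optimality criterion for nu = (sum_b w_b g_b) dx."""
    G = np.einsum('b,bij->ij', w, G_parts)   # G_nu(phi)
    return np.trace(np.linalg.inv(G))

res = minimize(
    h_A,
    x0=np.full(n_basis, k / masses.sum()),   # feasible starting weights
    bounds=[(1e-8, None)] * n_basis,         # nonnegative weights
    constraints=[{"type": "eq", "fun": lambda w: w @ masses - k}],  # nu(Omega) = k
    method="SLSQP",
)
print(res.x)  # optimized basis weights
```

Since \(G_\nu (\phi )\) is linear in the weights and both \(h_A\) and \(h_D\) are convex in the Gram matrix, the objective is convex in w, so a local solver started from a feasible point suffices for this toy problem.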
J Performance of PVS when dealing with both qualitative and quantitative variables
Following a similar idea as in Sect. 3.4 and Atkinson et al. (2007, Section 14), we consider the design space \(\varOmega =[0,1]^2\times \{0,1\}\) and \(k=p=11\), with the regression functions \(\phi _i\), for \(i\le 10\), being the 10 bivariate monomials of degree \(\le 3\) and \(\phi _{11}(x,y,z):=\mathbbm {1}_{z=1}\). We consider the non-Bayesian setting where \(\varLambda =0\). Following our approach in Sect. I, we parameterize \(\nu \) as a linear combination (with nonnegative weights) of the 462 functions of the form \((x,y,z)\mapsto P(x,y)\mathbbm {1}_{z=i}\), where P is a polynomial function of degree \(\le 10\) and \(i\in \{0,1\}\). We find the optimal weights numerically by solving the associated convex optimization problem to get an optimized measure \(\nu ^*\).
We show in Fig. 9 an example of a design generated by PVS with or without an optimized measure, compared to a uniformly drawn design and an optimal one. We also show in Fig. 8 the performance of PVS and i.i.d. designs with the reference measure being either uniform or \(\nu ^*\). The results are very similar to those in Fig. 3c, where we did not have qualitative factors. This shows that the addition of qualitative factors does not degrade the performance of PVS.
Cite this article
Poinas, A., Bardenet, R. On proportional volume sampling for experimental design in general spaces. Stat Comput 33, 29 (2023). https://doi.org/10.1007/s11222-022-10115-0