On proportional volume sampling for experimental design in general spaces

Abstract

Optimal design for linear regression is a fundamental task in statistics. For finite design spaces, recent progress has shown that random designs drawn using proportional volume sampling (PVS for short) lead to polynomial-time algorithms with approximation guarantees that outperform i.i.d. sampling. PVS strikes a balance between design nodes that jointly fill the design space and nodes that, marginally, stay in regions of high mass under the solution of a relaxed convex version of the original problem. In this paper, we examine some of the statistical implications of a new variant of PVS for (possibly Bayesian) optimal design. Using point process machinery, we treat the case of a generic Polish design space. We show that not only are known A-optimality approximation guarantees preserved, but we obtain similar guarantees for D-optimal design that tighten recent results. Moreover, we show that our PVS variant can be sampled in polynomial time. Unfortunately, in spite of its elegance and tractability, we demonstrate on a simple example that the practical implications of general PVS are likely limited. In the second part of the paper, we focus on applications and investigate the use of PVS as a subroutine for stochastic search heuristics. We demonstrate that PVS is a robust addition to the practitioner’s toolbox, especially when the regression functions are nonstandard and the design space, while low-dimensional, has a complicated shape (e.g., nonlinear boundaries, several connected components).

Notes

  1. Python code available at https://github.com/APoinas/Optimal-design-in-continuous-space

References

  • Andersen, M., Dahl, J., Liu, Z., Vandenberghe, L.: Interior-point methods for large-scale cone programming. In: Sra, S., Nowozin, S., Wright, S. (eds.) Optimization for Machine Learning, MIT Press, chap 1, pp. 55–83 (2012)

  • Atkinson, A., Donev, A., Tobias, R.: Optimum Experimental Designs, with SAS. Oxford University Press, Oxford Statistical Science Series, Oxford (2007)

  • Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, USA (2004)

  • Collings, B.J.: Characteristic polynomials by diagonal expansion. Am. Stat. 37(3), 233–235 (1983)

  • Daley, D.J., Vere-Jones, D.: An Introduction to the Theory of Point Processes, vol. I, 2nd edn. Springer, New York (2003)

  • De Castro, Y., Gamboa, F., Henrion, D., Hess, R., Lasserre, J.: Approximate optimal designs for multivariate polynomial regression. Ann. Stat. 47(1), 127–155 (2019)

  • Dereziński, M., Warmuth, M., Hsu, D.: Leveraged volume sampling for linear regression. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems, pp. 2510–2519 (2018)

  • Dereziński, M., Warmuth, M., Hsu, D.: Unbiased estimators for random design regression. ArXiv pre-print (2019)

  • Dereziński, M., Liang, F., Mahoney, M.: Bayesian experimental design using regularized determinantal point processes. In: Chiappa, S., Calandra, R. (Eds.) Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR, Online, Proceedings of Machine Learning Research, Vol. 108, pp. 3197–3207 (2020)

  • Dereziński, M., Mahoney, M.: Determinantal point processes in randomized numerical linear algebra. Not. Am. Math. Soc. 68, 1 (2021)

  • Dette, H.: Bayesian D-optimal and model robust designs in linear regression models. Stat.: J. Theor. Appl. Stat. 25(1), 27–46 (1993)

  • Dette, H., Studden, W.J.: The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis. Wiley Series in Probability and Statistics, Wiley, Hoboken (1997)

  • Dette, H., Melas, V., Pepelyshev, A.: D-optimal designs for trigonometric regression models on a partial circle. Ann. Inst. Stat. Math. 54, 945–959 (2002)

  • Dick, J., Pillichshammer, F.: Digital Nets and Sequences. Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)

  • Fang, K., Li, R., Sudjianto, A.: Design and modeling for computer experiments. Computer science and data analysis series, 1st edn. Chapman and Hall/CRC, Boca Raton (2006)

  • Farrell, R.H., Kiefer, J., Walbran, A.: Optimum multivariate designs. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1: Statistics, University of California Press, pp. 113–138 (1967)

  • Fedorov, V.: Theory of Optimal Experiments. Academic Press, New York (1972)

  • Gautier, G., Bardenet, R., Valko, M.: On two ways to use determinantal point processes for Monte Carlo integration. Tech. rep., ICML workshop on Negative dependence in machine learning (2019a)

  • Gautier, G., Polito, G., Bardenet, R., Valko, M.: DPPy: DPP Sampling with Python. Journal of Machine Learning Research - Machine Learning Open Source Software (JMLR-MLOSS) (2019b)

  • Grove, D., Woods, D., Lewis, S.: Multifactor B-spline mixed models in designed experiments for the engine mapping problem. J. Qual. Technol. 36(4), 380–391 (2004)

  • Hough, J., Krishnapur, M., Peres, Y., Virag, B.: Zeros of Gaussian Analytic Functions and Determinantal Point Processes. American Mathematical Society, Providence (2009)

  • Hough, J.B., Krishnapur, M., Peres, Y., Virág, B.: Determinantal processes and independence. Probability Surveys (2006)

  • Johansson, K.: Random matrices and determinantal processes. Les Houches Summer School Proc. 83(C), 1–56 (2006)

  • Kulesza, A., Taskar, B.: Determinantal point processes for machine learning. Foundations and Trends in Machine Learning (2012)

  • Lavancier, F., Møller, J., Rubak, E.: Determinantal point process models and statistical inference. J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 77, 853–877 (2015)

  • Liski, E., Mandal, N., Shah, K., Sinha, B.K.: Topics in Optimal Design, 1st edn. Lecture Notes in Statistics 163, Springer, New York (2002)

  • Liu, X., Yue, R.X., Chatterjee, K.: Geometric characterization of D-optimal designs for random coefficient regression models. Statist. Probab. Lett. 159, 108696 (2020)

  • Macchi, O.: The coincidence approach to stochastic point processes. Adv. Appl. Probab. 7, 83–122 (1975)

  • Maronge, J., Zhai, Y., Wiens, D., Fang, Z.: Optimal designs for spline wavelet regression models. J. Stat. Plann. Inference 184, 94–104 (2017)

  • Nikolov, A., Singh, M., Tantipongpipat, U.T.: Proportional volume sampling and approximation algorithms for A-optimal design. In: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, SODA ’19, pp. 1369–1386 (2019)

  • Piepel, G., Stanfill, B., Cooley, S., Jones, B., Kroll, J., Vienna, J.: Developing a space-filling mixture experiment design when the components are subject to linear and nonlinear constraints. Qual. Eng. 31(3), 463–472 (2019). https://doi.org/10.1080/08982112.2018.1517887

  • Pronzato, L., Pázman, A.: Design of Experiments in Nonlinear Models: Asymptotic Normality, Optimality Criteria and Small-Sample Properties. Lecture Notes in Statistics, vol. 212. Springer-Verlag, New York (2013)

  • Pukelsheim, F.: Optimal Design of Experiments. Classics in applied mathematics 50, Society for Industrial and Applied Mathematics (2006)

  • Pukelsheim, F., Rieder, S.: Efficient rounding of approximate designs. Biometrika 79(4), 763–770 (1992)

  • Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (2004)

  • Summa, M., Eisenbrand, F., Faenza, Y., Moldenhauer, C.: On largest volume simplices and sub-determinants. In: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms 2015 (2014)

  • Virtanen, P., Gommers, R., Oliphant, T., et al.: SciPy 1.0: fundamental algorithms for scientific computing in python. Nat. Methods 17, 261–272 (2020)

  • Woods, D., Lewis, S., Dewynne, J.: Designing experiments for multi-variable B-spline models. Sankhya 65, 660–670 (2003)

Acknowledgements

We thank Adrien Hardy for useful discussions throughout the project. We thank Michał Dereziński for his insightful comments and suggestions on an early draft. We acknowledge support from ERC grant Blackjack (ERC-2019-STG-851866) and ANR AI chair Baccarat (ANR-20-CHIA-0002).

Author information

Corresponding author

Correspondence to Arnaud Poinas.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Proof of the well-definedness of Definition 3.1

The Janossy densities are clearly nonnegative. Therefore, in order to prove that proportional volume sampling is well defined (see Daley and Vere-Jones 2003, Proposition 5.3.II.(ii)), we only need to show that

$$\begin{aligned} \sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}j_n(x)\textrm{d}\nu ^n(x)=1. \end{aligned}$$
(A.1)

We write the eigenvalues of \(\varLambda \) as \(\lambda _1\le \cdots \le \lambda _p\) and the spectral decomposition of \(\varLambda \) as \(\varLambda =P^T D_\lambda P\), where \(D_\lambda \) is the \(p\times p\) diagonal matrix with the \(\lambda _i\) as its diagonal entries. We then define the functions \(\psi _i\), \(1\le i\le p\), as the linear transforms of the functions \(\phi _i\) given by \((\psi _1(x),\cdots ,\psi _p(x)):=(\phi _1(x),\cdots ,\phi _p(x))P^T\). Finally, we have the decomposition

$$\begin{aligned} \det (\phi (x)^T\phi (x)+\varLambda )&=\det (P\phi (x)^T\phi (x)P^T+D_\lambda )\\&=\det (\psi (x)^T\psi (x)+D_\lambda )\\&=\sum _{S\subset [p]}\lambda ^{S^c}\det (\psi _S(x)^T\psi _S(x)) \end{aligned}$$

where \(\psi _S:=(\psi _{S_1},\cdots ,\psi _{S_{|S|}})\) and \(\lambda ^{S^c}:=\prod _{i\notin S}\lambda _i\), with the usual convention \(\lambda ^{\emptyset }=1\); see Collings (1983). Now, by the discrete Cauchy–Binet formula,

$$\begin{aligned} \det (\psi _S(x)^T\psi _S(x))=\sum _{\begin{array}{c} T\subset [n] \\ |T|=|S| \end{array}}\det (\psi _S(x_T))^2 \end{aligned}$$

where \(x_T:=(x_{T_1},\cdots ,x_{T_{|T|}})\). Then, by the more general Cauchy–Binet formula (Johansson 2006), we get

$$\begin{aligned} \int _{\varOmega ^n}\det (\psi _S(x_T))^2\textrm{d}\nu ^n(x)=|T|!\det (G_\nu (\psi _S))\nu (\varOmega )^{n-|T|}. \end{aligned}$$

Therefore

$$\begin{aligned}&\sum _{n\ge 0} \frac{1}{n!} \int _{\varOmega ^n}\det (\phi (x)^T\phi (x)+\varLambda )\textrm{d}\nu ^n(x) \\ =&\sum _{n\ge 0}\frac{1}{n!}\sum _{S\subset [p]}\lambda ^{S^c}\int _{\varOmega ^n}\det (\psi _S(x)^T\psi _S(x))\textrm{d}\nu ^n(x)\\ =&\sum _{n\ge 0}\frac{1}{n!}\sum _{S\subset [p]}\lambda ^{S^c}\sum _{\begin{array}{c} T\subset [n] \\ |T|=|S| \end{array}}|T|!\det (G_\nu (\psi _S))\nu (\varOmega )^{n-|T|} \\ =&\sum _{n\ge 0}\frac{1}{n!}\sum _{S\subset [p]}\lambda ^{S^c}\left( {\begin{array}{c}n\\ |S|\end{array}}\right) |S|!\det (G_\nu (\psi _S))\nu (\varOmega )^{n-|S|} \\ =&\sum _{n\ge 0}\sum _{S\subset [p]}\frac{\nu (\varOmega )^{n-|S|}}{(n-|S|)!}\lambda ^{S^c}\det (G_\nu (\psi _S))\mathbbm {1}_{n\ge |S|} \\ =&\sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))\sum _{n\ge |S|}\frac{\nu (\varOmega )^{n-|S|}}{(n-|S|)!}\\ =&\det (G_\nu (\psi )+D_\lambda )\exp (\nu (\varOmega )) \\ =&\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega )) \end{aligned}$$

where, in the last two identities, we used the facts that (i) \(G_\nu (\psi _S)\) is equal to \(G_\nu (\psi )_S\), the submatrix of \(G_\nu (\psi )\) whose rows and columns are indexed by S, and (ii) \(G_\nu (\psi )=G_\nu (\phi P^T)=PG_\nu (\phi )P^T\). This proves (A.1).
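When \(\nu \) is a discrete measure, the integrals above reduce to finite sums over tuples, so the per-\(n\) identity driving this computation, \(\frac{1}{n!}\int _{\varOmega ^n}\det (\phi (x)^T\phi (x)+\varLambda )\textrm{d}\nu ^n(x)=\sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{n-|S|}}{(n-|S|)!}\), can be checked by brute force. The following Python sketch does so on a small toy example; the grid, weights, features and \(\varLambda \) are arbitrary illustrative choices and this is not the paper's code.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete setting: N nodes in [0, 1], nu({x_j}) = w_j > 0 (arbitrary choices).
N, p, n = 5, 3, 4
nodes = np.linspace(0.0, 1.0, N)
w = rng.uniform(0.5, 1.5, size=N)                 # weights of the discrete measure nu
phi = np.vander(nodes, p, increasing=True)        # rows phi(x_j) = (1, x_j, x_j^2)
Lam = np.diag(rng.uniform(0.1, 1.0, size=p))      # a positive semidefinite Lambda

nu_total = w.sum()                                # nu(Omega)

# Left-hand side: (1/n!) int_{Omega^n} det(phi(x)^T phi(x) + Lambda) dnu^n(x),
# i.e. a weighted sum over all n-tuples of nodes.
lhs = 0.0
for idx in itertools.product(range(N), repeat=n):
    X = phi[list(idx), :]                         # the n x p matrix phi(x)
    lhs += np.prod(w[list(idx)]) * np.linalg.det(X.T @ X + Lam)
lhs /= math.factorial(n)

# Right-hand side: sum over subsets S of [p], with psi(x) = phi(x) V where
# Lambda = V diag(lam) V^T (so P = V^T in the notation of this appendix).
lam, V = np.linalg.eigh(Lam)
psi = phi @ V
G_psi = psi.T @ (w[:, None] * psi)                # G_nu(psi)
rhs = 0.0
for r in range(min(n, p) + 1):
    for S in itertools.combinations(range(p), r):
        Sc = [i for i in range(p) if i not in S]
        term = np.prod(lam[Sc]) * np.linalg.det(G_psi[np.ix_(S, S)])
        rhs += term * nu_total ** (n - r) / math.factorial(n - r)

print(lhs, rhs)   # the two values agree up to floating-point error
```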

B Proof of Proposition 3.2

First, we write

$$\begin{aligned} \mathbb {E}\left[ (\phi (X)^T\phi (X)+\varLambda )^{-1}\right] =\sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}(\phi (x)^T\phi (x)+\varLambda )^{-1}j_n(x)\textrm{d}{\nu }^n(x). \end{aligned}$$

Since \((\phi (x)^T\phi (x)+\varLambda )^{-1}\det (\phi (x)^T\phi (x)+\varLambda )\) is the adjugate matrix of \((\phi (x)^T\phi (x)+\varLambda )\), its (i, j) entry is

$$\begin{aligned} (-1)^{i+j}\det (\phi _{-j}(x)^T\phi _{-i}(x)+\varLambda _{-j,-i}), \end{aligned}$$

where we define \(\varLambda _{-j,-i}\) as the matrix \(\varLambda \) with its jth row and ith column removed, and \(\phi _{-i}(x)\) as the matrix \(\phi (x)\) with its ith column removed. Therefore, the (i, j) entry of the matrix \(\mathbb {E}\left[ (\phi (X)^T\phi (X)+\varLambda )^{-1}\right] \) is

$$\begin{aligned} \sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}\frac{(-1)^{i+j}\det (\phi _{-j}(x)^T\phi _{-i}(x)+\varLambda _{-j,-i})}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^n(x). \end{aligned}$$

Using the same reasoning as in the proof of normalization in Sect. A, we get that

$$\begin{aligned} \sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}\det (\phi _{-j}(x)^T\phi _{-i}(x)+\varLambda _{-j,-i})\textrm{d}\nu ^n(x) =\det \left( (\langle \phi _a,\phi _b\rangle )_{\begin{array}{c} 1\le a,b\le p \\ a\ne j, b\ne i \end{array}}+\varLambda _{-j,-i}\right) \exp (\nu (\varOmega )). \end{aligned}$$
(B.1)

Note that the proof in Sect. A does not rely on any symmetry argument, so that identity (B.1) can be proved in the same way. As a consequence, we get that

$$\begin{aligned} \sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}\frac{(-1)^{i+j}\det (\phi _{-j}(x)^T\phi _{-i}(x)+\varLambda _{-j,-i})}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^n(x) =\frac{(-1)^{i+j}\Delta _{j,i}(G_\nu (\phi )+\varLambda )}{\det (G_\nu (\phi )+\varLambda )}, \end{aligned}$$

which is the (i, j) entry of the inverse matrix of \(G_\nu (\phi )+\varLambda \). This proves identity (3.2).

Finally, the proof of identity (3.3) is straightforward:

$$\begin{aligned}&\mathbb {E}\left[ \det (\phi (X)^T\phi (X)+\varLambda )^{-1}\right] \\&\quad =\sum _{n\ge 0}\frac{1}{n!}\int _{\varOmega ^n}\frac{1}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^n(x)\\&\quad =\frac{\exp (\nu (\varOmega ))}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\\&\quad =\det (G_\nu (\phi )+\varLambda )^{-1}. \end{aligned}$$
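The key linear-algebra step in this appendix is the classical adjugate formula: for an invertible \(p\times p\) matrix M, the (i, j) entry of \(M^{-1}\det (M)\) is \((-1)^{i+j}\det (M_{-j,-i})\), where \(M_{-j,-i}\) is M with its jth row and ith column removed. A short NumPy check on a random matrix (a toy example, not from the paper) makes this explicit:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
M = rng.standard_normal((p, p)) + p * np.eye(p)   # a generic invertible matrix

adj = np.empty((p, p))
for i in range(p):
    for j in range(p):
        minor = np.delete(np.delete(M, j, axis=0), i, axis=1)  # remove row j and column i
        adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

# adj coincides with det(M) * M^{-1} up to floating-point error
print(np.max(np.abs(adj - np.linalg.det(M) * np.linalg.inv(M))))
```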

C Proof of Proposition 3.3

By definition of the Janossy densities, we have

$$\begin{aligned} \mathbb {E}\left[ \det (\phi (X)^T\phi (X)+\varLambda )^{-1}\big | |X|=k\right] =\frac{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\det (\phi (x)^T\phi (x)+\varLambda )^{-1}\textrm{d}{\nu }^k(x)}{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{d}{\nu }^k(x)}. \end{aligned}$$
(C.1)

The integral in the numerator simplifies to

$$\begin{aligned}&\int _{\varOmega ^k}j_k(x)\det (\phi (x)^T\phi (x)+\varLambda )^{-1}\textrm{d}{\nu }^k(x)\nonumber \\&\quad =\int _{\varOmega ^k}\frac{1}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^k(x)\nonumber \\&\quad =\frac{\nu (\varOmega )^k}{\exp (\nu (\varOmega ))}\det (G_\nu (\phi )+\varLambda )^{-1}. \end{aligned}$$
(C.2)

As for the denominator of (C.1), following the lines of Sect. A leads to

$$\begin{aligned} \frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{d}{\nu }^k(x)=\frac{\sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!}}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}, \end{aligned}$$
(C.3)

where the \(\psi \) functions are defined the same way as in Sect. A. Recalling that

$$\begin{aligned} \sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))=\det (G_\nu (\phi )+\varLambda ), \end{aligned}$$

we can rewrite the sum in (C.3) as

$$\begin{aligned} \sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!} &=\frac{\nu (\varOmega )^{k-p}}{(k-p)!}\det (G_\nu (\phi )+\varLambda )\\ &\quad +\sum _{\begin{array}{c} S\subset [p] \\ S\ne [p] \end{array}}\lambda ^{S^c}\det (G_\nu (\psi _S))\left( \frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!}-\frac{\nu (\varOmega )^{k-p}}{(k-p)!}\right) . \end{aligned}$$

Now, since \(\nu (\varOmega )=k\), the sequence \(i\mapsto \nu (\varOmega )^i/i!\) is increasing when \(i\le k\). Hence, for all \(S\subset [p]\) such that \(S\ne [p]\),

$$\begin{aligned} \frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!}-\frac{\nu (\varOmega )^{k-p}}{(k-p)!}\ge \frac{\nu (\varOmega )^{k-p+1}}{(k-p+1)!}-\frac{\nu (\varOmega )^{k-p}}{(k-p)!}=\frac{k^{k-p}}{(k-p)!}\times \frac{p-1}{k-p+1}. \end{aligned}$$

We thus obtain

$$\begin{aligned} \sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!} \ge \frac{k^{k-p}}{(k-p)!}\left( \det (G_\nu (\phi )+\varLambda )+\frac{p-1}{k-p+1}\big (\det (G_\nu (\phi )+\varLambda )-\det (G_\nu (\phi ))\big )\right) . \end{aligned}$$
(C.4)

Finally, combining (C.1), (C.2), (C.4) and the fact that \(\nu (\varOmega )=k\), we get

$$\begin{aligned}&\mathbb {E}\left[ \det (\phi (X)^T\phi (X){+}\varLambda )^{-1}\big | |X|{=}k\right] \\&\quad \le \frac{\frac{k^k}{k!\exp (k)}\det (G_\nu (\phi ){+}\varLambda )^{-1}}{\frac{k^{k-p}}{(k-p)!\exp (k)}\left( 1{+}\frac{p{-}1}{k-p{+}1} \big (1{-}\det (G_\nu (\phi )(G_\nu (\phi ){+}\varLambda )^{-1})\!\big )\!\right) }\\&\quad = \frac{k^p(k-p)!}{k!}\frac{\det (G_\nu (\phi )+\varLambda )^{-1}}{1+\frac{p-1}{k-p+1}\big (1-\det (G_\nu (\phi )(G_\nu (\phi )+\varLambda )^{-1})\big )}, \end{aligned}$$

concluding the proof.
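Since the conditional expectation above equals \(\nu (\varOmega )^k\big /\int _{\varOmega ^k}\det (\phi (x)^T\phi (x)+\varLambda )\textrm{d}\nu ^k(x)\) (combine (C.1) and (C.2); see also Sect. D), the bound of Proposition 3.3 can be checked by exhaustive enumeration when \(\nu \) is a discrete measure with \(\nu (\varOmega )=k\). A small NumPy sketch with toy nodes, weights and features (illustrative choices, not the paper's code):

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(2)

N, p, k = 5, 2, 3                                  # toy sizes, with k >= p
nodes = np.linspace(0.0, 1.0, N)
w = rng.uniform(0.5, 1.5, size=N)
w *= k / w.sum()                                   # enforce nu(Omega) = k
phi = np.vander(nodes, p, increasing=True)         # rows phi(x_j) = (1, x_j)
Lam = 0.3 * np.eye(p)

G = phi.T @ (w[:, None] * phi)                     # G_nu(phi)

# Exact conditional expectation: nu(Omega)^k / int det(phi(x)^T phi(x) + Lambda) dnu^k(x)
denom = 0.0
for idx in itertools.product(range(N), repeat=k):
    X = phi[list(idx), :]
    denom += np.prod(w[list(idx)]) * np.linalg.det(X.T @ X + Lam)
lhs = k ** k / denom

# Right-hand side of the bound of Proposition 3.3
GL_inv = np.linalg.inv(G + Lam)
correction = 1.0 + (p - 1) / (k - p + 1) * (1.0 - np.linalg.det(G @ GL_inv))
rhs = (k ** p * math.factorial(k - p) / math.factorial(k)) / np.linalg.det(G + Lam) / correction

print(lhs, rhs)    # lhs should not exceed rhs
```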

D Proof of Proposition 3.4

Using Jensen's inequality and the convexity of \(x\mapsto 1/x\) on \(\mathbb {R}_+^*\), we obtain

$$\begin{aligned}&\mathbb {E}[\det (\phi (Y)^T\phi (Y)+\varLambda )^{-1}]\\&\quad \ge \mathbb {E}[\det (\phi (Y)^T\phi (Y)+\varLambda )]^{-1}\\&\quad =\left( \nu (\varOmega )^{-k}\int _{\varOmega ^k} \det (\phi (y)^T\phi (y)+\varLambda )\textrm{d}\nu ^k(y)\right) ^{-1}. \end{aligned}$$

Now, in Sect. C we showed that

$$\begin{aligned}&\mathbb {E}[\det (\phi (X)^T\phi (X)+\varLambda )^{-1}\big | |X|=k]\\&\quad =\frac{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\det (\phi (x)^T\phi (x)+\varLambda )^{-1}\textrm{d}{\nu }^k(x)}{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{d}{\nu }^k(x)}\\&\quad =\frac{\frac{\nu (\varOmega )^k}{\exp (\nu (\varOmega ))}\det (G_\nu (\phi )+\varLambda )^{-1}}{\int _{\varOmega ^k}\frac{\det (\phi (x)^T\phi (x)+\varLambda )}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^k(x)}\\&\quad =\nu (\varOmega )^k\left( \int _{\varOmega ^k}\det (\phi (x)^T\phi (x)+\varLambda )\textrm{d}\nu ^k(x)\right) ^{-1} \end{aligned}$$

which concludes the proof.

E Proof of Proposition 3.5

By definition of the Janossy densities, we have

$$\begin{aligned} \mathbb {E}\left[ \textrm{Tr}((\phi (X)^T\phi (X)+\varLambda )^{-1})\big | |X|=k\right] =\frac{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{Tr}((\phi (x)^T\phi (x)+\varLambda )^{-1})\textrm{d}{\nu }^k(x)}{\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{d}{\nu }^k(x)}. \end{aligned}$$
(E.1)

Using the same notation as in Sect. A, we expand the numerator into

$$\begin{aligned}&\frac{1}{k!}\int _{\varOmega ^k}j_k(x) \textrm{Tr}((\phi (x)^T\phi (x)+\varLambda )^{-1}) \textrm{d}{\nu }^k(x)\nonumber \\&\quad =\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{Tr}((\psi (x)^T\psi (x)+D_\lambda )^{-1})\textrm{d}{\nu }^k(x)\nonumber \\&\quad =\sum _{i=1}^p\frac{1}{k!}\int _{\varOmega ^k}\frac{\Delta _{i,i}(\psi (x)^T\psi (x)+D_{\lambda })}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^k(x). \end{aligned}$$
(E.2)

Now,

$$\begin{aligned} \frac{1}{k!}\int _{\varOmega ^k}\Delta _{i,i}(\psi (x)^T\psi (x)+D_{\lambda })\textrm{d}\nu ^k(x) =\sum _{S\subset [p]\backslash \{i\}}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{k-|S|}}{(k-|S|)!}, \end{aligned}$$
(E.3)

where in this case \(S^c\) denotes the complement of S relative to \([p]\backslash \{i\}\). Note that exactly \(\dim (\text {Ker}(\varLambda ))=m_0\) eigenvalues of \(\varLambda \) are equal to 0; a term in the sum in (E.3) can only be nonzero if \(S^c\) contains no index of a zero eigenvalue, and since at least \(m_0-1\) such indices lie in \([p]\backslash \{i\}\), nonzero terms satisfy \(|S|\ge m_0-1\), i.e., the terms with \(|S|<m_0-1\) vanish. Since \(\nu (\varOmega )=k\), the sequence \(i\mapsto \nu (\varOmega )^i/i!\) is increasing when \(i\le k\), so that

$$\begin{aligned}&\frac{1}{k!}\int _{\varOmega ^k}\Delta _{i,i}(\psi (x)^T\psi (x)+D_{\lambda })\textrm{d}\nu ^k(x)\\&\quad \le \sum _{S\subset [p]\backslash \{i\}}\lambda ^{S^c}\det (G_\nu (\psi _S))\frac{\nu (\varOmega )^{k+1-m_0}}{(k+1-m_0)!}\nonumber \\&\quad =\frac{\nu (\varOmega )^{k+1-m_0}}{(k+1-m_0)!}\Delta _{i,i}(G_\nu (\psi )+D_{\lambda }) \end{aligned}$$

which, combined with (E.2), gives

$$\begin{aligned} \frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{Tr}((\phi (x)^T\phi (x)+\varLambda )^{-1})\textrm{d}{\nu }^k(x)\le \nonumber \\ \frac{\nu (\varOmega )^{k+1-m_0}\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1})}{(k+1-m_0)!\exp (\nu (\varOmega ))}. \end{aligned}$$
(E.4)

Finally, combining (E.1), (E.4) and the lower bound \(\frac{1}{k!}\int _{\varOmega ^k}j_k(x)\textrm{d}{\nu }^k(x)\ge \frac{\nu (\varOmega )^{k-p}}{(k-p)!\exp (\nu (\varOmega ))}\), which follows from (C.3) since \(\nu (\varOmega )^{k-|S|}/(k-|S|)!\ge \nu (\varOmega )^{k-p}/(k-p)!\) for every S and \(\sum _{S\subset [p]}\lambda ^{S^c}\det (G_\nu (\psi _S))=\det (G_\nu (\phi )+\varLambda )\), gives

$$\begin{aligned}&\mathbb {E}\left[ \textrm{Tr}((\phi (X)^T\phi (X)+\varLambda )^{-1})\big | |X|=k\right] \\&\quad \le \frac{\frac{\nu (\varOmega )^{k+1-m_0}\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1})}{(k+1-m_0)!\exp (\nu (\varOmega ))}}{\frac{\nu (\varOmega )^{k-p}}{(k-p)!\exp (\nu (\varOmega ))}}\\&\quad =\frac{\nu (\varOmega )^{p+1-m_0}(k-p)!}{(k+1-m_0)!}\textrm{Tr}\big ((G_\nu (\phi )+\varLambda )^{-1}\big ) \end{aligned}$$

and since \(\nu (\varOmega )=k\), this concludes the proof.

F Proof of Proposition 3.7

The Janossy densities and correlation functions of a point process are linked by the following identity; see Daley and Vere-Jones (2003, Lemma 5.4.III):

$$\begin{aligned} \rho _n(x_1,\cdots ,x_n)=\sum _{m\ge 0}\frac{1}{m!}\int _{\varOmega ^m}j_{n+m}(x,y)\textrm{d}{\nu }^m(y). \end{aligned}$$

Applying this identity to the Janossy densities of \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\), we get

$$\begin{aligned} \rho _n(x_1,\cdots ,x_n)=\\ \sum _{m\ge 0}\frac{1}{m!}\int _{\varOmega ^m}\frac{\det (\phi (x)^T\phi (x)+\phi (y)^T\phi (y)+\varLambda )}{\det (G_\nu (\phi )+\varLambda )\exp (\nu (\varOmega ))}\textrm{d}\nu ^m(y). \end{aligned}$$

Now, for all \(x_1,\cdots ,x_n\in \varOmega \), using the same reasoning as in the proof of normalization in Sect. A but replacing the matrix \(\varLambda \) with the matrix \(\phi (x)^T\phi (x)+\varLambda \), we get

$$\begin{aligned} \sum _{m\ge 0}\frac{1}{m!}\int _{\varOmega ^m}\det (\phi (x)^T\phi (x)+\phi (y)^T\phi (y)+\varLambda )\textrm{d}\nu ^m(y)\\ \quad =\det (\phi (x)^T\phi (x)+\varLambda )\exp (\nu (\varOmega )). \end{aligned}$$

We then conclude that

$$\begin{aligned} \rho _n(x_1,\cdots ,x_n)=\frac{\det (G_\nu (\phi )+\varLambda +\phi (x)^T\phi (x))}{\det (G_\nu (\phi )+\varLambda )}. \end{aligned}$$

G Proof of Proposition 3.8

For any \(n\in \mathbb {N}\) and \(x\in \varOmega ^n\), we write K[x] for the \(n\times n\) matrix with entries \(K(x_i,x_j)\). Since \(G_\nu (\phi )+\varLambda \) is invertible, we have

$$\begin{aligned} \rho _n(x)&=\det \left( I_p+(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\phi (x)\right) \nonumber \\&=\det \left( I_n+\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\right) \nonumber \\&=\det \left( I_n+K[x]\right) . \end{aligned}$$
(G.1)

It now remains to show that the superposition of X and Y has the correlation functions (G.1) in order to conclude that its distribution is \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,A)\).

Let \(n\in \mathbb {N}\). We recall that the nth-order correlation function \(\rho '_n\) of \(X\cup Y\) satisfies

$$\begin{aligned} \mathbb {E}\left[ \sum _{x_1,\cdots ,x_n\in X\cup Y}^{\ne }f(x_1,\cdots ,x_n)\right] =\int _{\varOmega ^n}f(x_1,\cdots ,x_n)\rho '_n(x_1,\cdots ,x_n)\textrm{d}{\nu }(x_1)\cdots \textrm{d}{\nu }(x_n) \end{aligned}$$
(G.2)

for all integrable functions f, where the \(\ne \) symbol means that the sum is taken over distinct elements of \(X\cup Y\). Since each element of \(X\cup Y\) is either in X or in Y, (G.2) can be rewritten as

$$\begin{aligned}&\mathbb {E}\left[ \sum _{x_1,\cdots ,x_n\in X\cup Y}^{\ne }f(x_1,\cdots ,x_n)\right] \\&\quad =\sum _{S\subset [n]}\mathbb {E}\left[ \sum _{x_i\in X, i\in S}^{\ne }\sum _{x_j\in Y, j\in S^c}^{\ne }f(x_1,\cdots ,x_n)\right] \\&\quad =\sum _{S\subset [n]}\mathbb {E}\left[ \sum _{x_i\in X, i\in S}^{\ne }\mathbb {E}\left[ \sum _{x_j\in Y, j\in S^c}^{\ne }f(x_1,\cdots ,x_n)\right] \right] \\&\quad = \sum _{S\subset [n]}\mathbb {E}\left[ \sum _{x_i\in X, i\in S}^{\ne }\int _{\varOmega ^{|S^c|}}f(x_1,\cdots ,x_n)\prod _{j\in S^C}\textrm{d}\nu (x_j)\right] \\&\quad =\sum _{S\subset [n]}\int _{\varOmega ^n}f(x_1,\cdots ,x_n)\det ((K(x_i,x_j))_{i,j\in S})\textrm{d}\nu ^n(x)\\&\quad =\int _{\varOmega ^n}f(x_1,\cdots ,x_n)\det (I_n+K[x])\textrm{d}\nu ^n(x). \end{aligned}$$

This proves that the correlation functions of \(X\cup Y\) also satisfy

$$\begin{aligned} \rho '_n(x)=\det (I_n+K[x]). \end{aligned}$$

Therefore, \(X\cup Y\) is distributed as \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,A)\).

Fig. 7 3D plots of the densities of the measures minimizing (1.2) for the D- and A-optimality criteria when the \(g_i\) functions are the binomial polynomials of degree \(\le 10\) as well as their compositions with \((x,y)\mapsto (1-x,1-y)\)

H Proof of Corollary 3.10

\(X\sim \mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\) is the superposition of a Poisson point process Y with intensity \(\nu \) and a DPP Z whose intensity with respect to \(\nu \) is \(\rho (x)=\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\); see identity (G.1). Therefore,

$$\begin{aligned} \mathbb {E}[|X|]=\mathbb {E}[|Y|]+\mathbb {E}[|Z|] \end{aligned}$$

with \(\mathbb {E}[|Y|]=\nu (\varOmega )\) and

$$\begin{aligned} \mathbb {E}[|Z|]=\int _\varOmega \phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\textrm{d}\nu (x). \end{aligned}$$

Since we can rewrite \(\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\) as \(\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1}\phi (x)^T\phi (x))\), we get

$$\begin{aligned} \mathbb {E}[|Z|]=\textrm{Tr}((G_\nu (\phi )+\varLambda )^{-1}G_\nu (\phi )), \end{aligned}$$

concluding the proof.
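In practice, Proposition 3.8 and Corollary 3.10 give a direct recipe for sampling \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\) once \(\nu \) is replaced by a discrete measure supported on a finite grid: draw a Poisson point process with intensity \(\nu \) and, independently, a DPP with correlation kernel \(K(x,y)=\phi (x)(G_\nu (\phi )+\varLambda )^{-1}\phi (y)^T\), then superpose the two. The NumPy sketch below does exactly this with a generic spectral DPP sampler and empirically checks the expected cardinality of Corollary 3.10; the grid, features, \(\varLambda \) and sampler are illustrative choices, not the paper's implementation (which is available at the repository linked in the Notes).

```python
import numpy as np

rng = np.random.default_rng(3)

# Discretized toy setting: nu is a discrete measure on a grid of Omega = [0, 1].
N, p = 100, 3
nodes = (np.arange(N) + 0.5) / N
w = np.full(N, 5.0 / N)                            # uniform weights, nu(Omega) = 5
phi = np.vander(nodes, p, increasing=True)         # phi(x) = (1, x, x^2)
Lam = 0.1 * np.eye(p)

G = phi.T @ (w[:, None] * phi)                     # G_nu(phi)
M = np.linalg.inv(G + Lam)
# Correlation kernel of the DPP component, symmetrized with respect to the weights w
K = (np.sqrt(w)[:, None] * phi) @ M @ (np.sqrt(w)[:, None] * phi).T

def sample_dpp(K, rng):
    """Spectral (HKPV) sampler for a finite DPP with symmetric correlation kernel K."""
    vals, vecs = np.linalg.eigh(K)
    keep = rng.random(len(vals)) < np.clip(vals, 0.0, 1.0)
    V = vecs[:, keep]
    sample = []
    while V.shape[1] > 0:
        probs = np.maximum(np.sum(V ** 2, axis=1), 0.0)
        probs /= probs.sum()
        i = rng.choice(len(probs), p=probs)        # pick a grid index
        sample.append(i)
        # Restrict the spanned subspace to vectors vanishing at coordinate i
        j = np.argmax(np.abs(V[i, :]))
        V = V - np.outer(V[:, j], V[i, :] / V[i, j])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sample

def sample_pvs(rng):
    # Poisson component: Poisson(nu(Omega)) points drawn i.i.d. from nu / nu(Omega)
    n_pois = rng.poisson(w.sum())
    poisson_part = rng.choice(N, size=n_pois, p=w / w.sum())
    # DPP component, independent of the Poisson part
    dpp_part = sample_dpp(K, rng)
    return list(poisson_part) + list(dpp_part)     # superposition, as in Proposition 3.8

sizes = [len(sample_pvs(rng)) for _ in range(1000)]
expected = w.sum() + np.trace(M @ G)               # Corollary 3.10
print(np.mean(sizes), expected)                    # the two values should be close
```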

I A parametrized reference measure for Sect. 3.4.

To parametrize \(\nu \), we write its density f as a linear combination of positive functions with nonnegative weights, that is,

$$\begin{aligned} f(x)=\sum _{i=1}^n\omega _i g_i(x). \end{aligned}$$
(1.1)

This way, minimizing \(h(G_\nu (\phi )+\varLambda )\) over measures \(\nu =f\textrm{d}x\) with \(f\) of the form (1.1) and such that \(\nu (\varOmega )=k\) is equivalent to finding \((\omega _1,\cdots ,\omega _n)\) minimizing

$$\begin{aligned} h\left( \sum _{i=1}^n \omega _iG_{g_i}(\phi ) + \varLambda \right) ~\text{ s.t. }~\omega \succcurlyeq 0~ \text{ and }~\sum _{i=1}^n\omega _i\int _\varOmega g_i(x)\textrm{d}x=k. \end{aligned}$$
(1.2)

This is now a convex optimization problem that can be solved numerically. For our illustration, we take \(h\in \{h_D,h_A\}\) and the \(g_i\) to be the 231 polynomial functions of two variables with degree \(\le 10\) as well as their compositions with \((x,y)\mapsto (1-x,1-y)\), which are all nonnegative functions on \(\varOmega =[0,1]^2\). We show in Fig. 7 the densities of the measures minimizing (1.2) for both optimality criteria and for \(\varLambda \in \{I_{10}, 0.01 I_{10}, 0.0001 I_{10}\}\).
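As an illustration of how (1.2) can be set up and solved, here is a minimal sketch using CVXPY with a deliberately tiny basis of monomial densities on \([0,1]^2\), whose moments are available in closed form. The basis, the feature set, the value of k, the choice \(h_D(M)=-\log \det (M)\) and the use of CVXPY are all illustrative assumptions; they differ from the 231-function basis and the solver behind Fig. 7.

```python
import numpy as np
import cvxpy as cp

# Regression features: bivariate monomials of total degree <= 2 on Omega = [0,1]^2 (toy choice)
feat = [(a, b) for a in range(3) for b in range(3) if a + b <= 2]   # 6 monomials
p = len(feat)
# Candidate densities g_i: monomials x^a y^b with a, b <= 2 (toy basis, not the paper's)
basis = [(a, b) for a in range(3) for b in range(3)]                # 9 functions
n = len(basis)
k = 20                                                              # target total mass nu(Omega)
Lam = 0.01 * np.eye(p)

def moment(a, b):
    """Integral of x^a y^b over [0,1]^2."""
    return 1.0 / ((a + 1) * (b + 1))

# G_{g_i}(phi)[u, v] = int g_i(x, y) phi_u(x, y) phi_v(x, y) dx dy
G_list = [np.array([[moment(a + au + av, b + bu + bv) for (av, bv) in feat]
                    for (au, bu) in feat])
          for (a, b) in basis]
masses = np.array([moment(a, b) for (a, b) in basis])               # int g_i(x) dx

w = cp.Variable(n, nonneg=True)                                     # omega >= 0
M = sum(w[i] * G_list[i] for i in range(n)) + Lam
# D-criterion, taken here as h_D(M) = -log det(M): convex in the weights
prob = cp.Problem(cp.Minimize(-cp.log_det(M)), [w @ masses == k])
prob.solve()
print(prob.status, np.round(w.value, 3))
```

The A-criterion can be encoded similarly, for instance through cp.matrix_frac(np.eye(p), M), a convex expression equal to \(\textrm{Tr}(M^{-1})\).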

Fig. 8 Violin and boxplots of the D-efficiency of 2000 random designs in the setting of Sect. J. The designs are either sampled i.i.d. from \(\nu \) or from \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,\varLambda )\), where \(\nu \) is either the uniform distribution on \(\varOmega \) or the optimized measure \(\nu ^*\). Dash-dotted lines show the bounds of Proposition 3.3

Fig. 9 Example draws of a few random designs mentioned in Sect. J. We display an i.i.d. uniform draw (a), and draws from the \(\mathbb {P}_{\textrm{VS}}^{\nu }(\phi ,0)\) distribution where \(\nu \) is either (b) the uniform distribution on \(\varOmega =[0,1]^2\times \{0,1\}\) or (c) the distribution on \(\varOmega \) with the optimized density \(\nu ^*\). Figure (d) shows an optimal design for comparison. The two different marks correspond to the two levels of the qualitative factor

J Performance of PVS when dealing with both qualitative and quantitative variables

Following an idea similar to that of Sect. 3.4 and Atkinson et al. (2007, Section 14), we consider the design space \(\varOmega =[0,1]^2\times \{0,1\}\) and \(k=p=11\), with the regression functions \(\phi _i\), for \(i\le 10\), being the 10 bivariate polynomials of degree \(\le 3\) and \(\phi _{11}(x,y,z):=\mathbbm {1}_{z=1}\). We consider the non-Bayesian setting where \(\varLambda =0\). Following our approach in Sect. I, we parametrize \(\nu \) as a linear combination (with nonnegative weights) of the 462 functions of the form \((x,y,z)\mapsto P(x,y)\mathbbm {1}_{z=i}\), where P is a polynomial function of degree \(\le 10\) and \(i\in \{0,1\}\). We find the optimal weights numerically by solving the associated convex optimization problem to get an optimized measure \(\nu ^*\).

We show in Fig. 9 example designs generated by PVS with and without an optimized measure, compared to a uniformly drawn design and an optimal one. We also show in Fig. 8 the performance of PVS and i.i.d. designs with the reference measure being either uniform or \(\nu ^*\). The results are very similar to those in Fig. 3c, where we did not have qualitative factors. This shows that the addition of qualitative factors does not degrade the performance of PVS.

Cite this article

Poinas, A., Bardenet, R. On proportional volume sampling for experimental design in general spaces. Stat Comput 33, 29 (2023). https://doi.org/10.1007/s11222-022-10115-0
