Computing a Quantity of Interest from Observational Data

Abstract

Scientific problems often feature observational data received in the form \(w_1=l_1(f),\ldots ,w_m=l_m(f)\) of known linear functionals applied to an unknown function f from some Banach space \(\mathcal {X}\), and it is required either to approximate f (the full approximation problem) or to estimate a quantity of interest Q(f). In typical examples, the quantities of interest can be the maximum/minimum of f or some averaged quantity such as the integral of f, while the observational data consists of point evaluations. To obtain meaningful results about such problems, it is necessary to possess additional information about f, usually in the form of an assumption that f belongs to a certain model class \(\mathcal {K}\) contained in \(\mathcal {X}\). This is precisely the framework of optimal recovery, which has produced substantial results when the model class is a ball in a smoothness space, e.g., the unit ball in a Lipschitz, Sobolev, or Besov space. This paper is concerned with other model classes described by approximation processes, as studied in DeVore et al. [15]. Its main contributions are: (1) designing implementable optimal or near-optimal algorithms for the estimation of quantities of interest, (2) constructing linear optimal or near-optimal algorithms for the full approximation of an unknown function using its point evaluations. While the existence of linear optimal algorithms for the approximation of linear functionals Q(f) is a classical result established by Smolyak, a numerically friendly procedure that performs this approximation is not generally available. In this paper, we show that in classical recovery settings, such linear optimal algorithms can be produced by constrained minimization methods. We illustrate these techniques on several examples involving the computation of integrals using point evaluation data. In addition, we show that linearization of optimal algorithms can be achieved for the full approximation problem in the important situation where the \(l_j\) are point evaluations and \(\mathcal {X}\) is a space of continuous functions equipped with the uniform norm. We also show how quasi-interpolation theory enables the construction of linear near-optimal algorithms for the recovery of the underlying function.
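
As a first illustration of the constrained-minimization approach described above, the sketch below estimates the integral of a function from point values by computing the smallest and largest values of the integral over polynomials consistent with the data and returning the midpoint of this interval. It is only meant to convey the flavor of the optimal-interval (Chebyshev-center) idea, not to reproduce the paper's algorithms: the choice of the Legendre basis, the degree, the tolerance, the equispaced points, and the test function are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): estimate Q(f) = int_{-1}^{1} f from data
# w_j = f(x_j), assuming f is within eps (in the uniform norm) of the space V of
# polynomials of degree < n.  The interval {Q(v) : v in V, |v(x_j) - w_j| <= eps}
# is found by two linear programs; its midpoint is returned as the estimate.
import numpy as np
from numpy.polynomial import legendre
from scipy.optimize import linprog

def estimate_integral(x, w, n, eps):
    """Midpoint of {Q(v) : v in V, |v(x_j) - w_j| <= eps for all j}."""
    V = legendre.legvander(x, n - 1)             # V[j, k] = P_k(x_j)
    c = np.zeros(n); c[0] = 2.0                  # Q(v) = 2*a_0 in the Legendre basis
    A_ub = np.vstack([V, -V])                    # -eps <= V a - w <= eps
    b_ub = np.concatenate([w + eps, eps - w])
    bounds = [(None, None)] * n                  # coefficients are unconstrained
    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
    hi = -linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 20)               # point-evaluation functionals
    w = np.exp(x)                                # observational data for f = exp
    print(estimate_integral(x, w, n=8, eps=1e-3))   # close to e - 1/e ~ 2.3504
```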

Notes

  1. It is worth mentioning that the classical notion of a Chebyshev center of a set considered in this paper is different from the more computable notion considered in [8], which corresponds to the center of the largest ball contained in the given set.

  2. There are various conditions on Q ensuring that \(Q(\mathcal{K}_w(\epsilon ,V))\) is bounded, e.g., Q being a Lipschitz map.

  3. The correction (4.4) may be omitted, since a pointwise near-optimal algorithm is already provided by \(w\mapsto v(w)\), but it makes the algorithm A data-consistent.

References

  1. Adcock, B., Hansen, A.C.: Stable reconstructions in Hilbert spaces and the resolution of the Gibbs phenomenon. Appl. Comput. Harmon. Anal. 32, 357–388 (2012)

  2. Adcock, B., Hansen, A.C., Poon, C.: Beyond consistent reconstructions: optimality and sharp bounds for generalized sampling, and application to the uniform resampling problem. SIAM J. Math. Anal. 45, 3132–3167 (2013)

  3. Adcock, B., Platte, R.B., Shadrin, A.: Optimal sampling rates for approximating analytic functions from pointwise samples. IMA J. Numer. Anal. (2018). https://doi.org/10.1093/imanum/dry024

  4. Bakhvalov, N.S.: On the optimality of linear methods for operator approximation in convex classes of functions. USSR Comput. Math. Math. Phys. 11, 244–249 (1971)

  5. Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G., Wojtaszczyk, P.: Convergence rates for greedy algorithms in reduced basis methods. SIAM J. Math. Anal. 43, 1457–1472 (2011)

  6. Binev, P., Cohen, A., Dahmen, W., DeVore, R., Petrova, G., Wojtaszczyk, P.: Data assimilation in reduced modeling. SIAM/ASA J. Uncertain. Quantif. 5, 1–29 (2017)

  7. Bojanov, B.: Optimal recovery of functions and integrals. In: First European Congress of Mathematics, Vol. I (Paris, 1992), Progress in Mathematics, vol. 119, pp. 371–390. Birkhäuser, Basel (1994)

  8. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)

  9. Cohen, A., Dahmen, W., DeVore, R.: Compressed sensing and best \(k\)-term approximation. J. Am. Math. Soc. 22, 211–231 (2009)

  10. Coppersmith, D., Rivlin, T.: The growth of polynomials bounded at equally spaced points. SIAM J. Math. Anal. 23, 970–983 (1992)

  11. Creutzig, J., Wojtaszczyk, P.: Linear versus nonlinear algorithms for linear problems. J. Complex. 20, 807–820 (2004)

  12. CVX Research, Inc.: CVX: MATLAB software for disciplined convex programming, version 2.1. http://cvxr.com/cvx (2014)

  13. DeVore, R.: Nonlinear approximation. Acta Numer. 7, 51–150 (1998)

  14. DeVore, R., Lorentz, G.G.: Constructive Approximation, Grundlehren der mathematischen Wissenschaften, vol. 303. Springer, Berlin (1993)

  15. DeVore, R., Petrova, G., Wojtaszczyk, P.: Data assimilation and sampling in Banach spaces. Calcolo 54, 1–45 (2017)

  16. Driscoll, T.A., Hale, N., Trefethen, L.N. (eds.): Chebfun Guide. Pafnuty Publications, Oxford (2014)

  17. Elad, M.: Sparse and Redundant Representations. Springer, Berlin (2010)

  18. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Birkhäuser, Basel (2013)

  19. Kalman, J.A.: Continuity and convexity of projections and barycentric coordinates in convex polyhedra. Pac. J. Math. 11, 1017–1022 (1961)

  20. Lindenstrauss, J.: Extension property for compact operators. Mem. Am. Math. Soc. 48 (1964)

  21. Marcinkiewicz, J., Zygmund, A.: Mean values of trigonometrical polynomials. Fundamenta Mathematicae 28, 131–166 (1937)

  22. Micchelli, C., Rivlin, T.: Lectures on optimal recovery. In: Numerical Analysis (Lancaster, 1984), Lecture Notes in Mathematics, vol. 1129, pp. 21–93. Springer, Berlin (1985)

  23. Micchelli, C., Rivlin, T., Winograd, S.: The optimal recovery of smooth functions. Numer. Math. 26, 191–200 (1976)

  24. Milman, V., Schechtman, G.: Asymptotic Theory of Finite Dimensional Normed Spaces, Lecture Notes in Mathematics, vol. 1200. Springer, Berlin (1986)

  25. Osipenko, K.Yu.: Best approximation of analytic functions from information about their values at a finite number of points. Math. Notes Acad. Sci. USSR 19(1), 17–23 (1976)

  26. Platte, R., Trefethen, L., Kuijlaars, A.: Impossibility of fast stable approximation of analytic functions from equispaced samples. SIAM Rev. 53, 308–318 (2011)

  27. Schönhage, A.: Fehlerfortpflanzung bei Interpolation. Numer. Math. 3, 62–71 (1961)

  28. Traub, J., Wozniakowski, H.: A General Theory of Optimal Algorithms. Academic Press, New York (1980)

  29. Turetskii, A.H.: The bounding of polynomials prescribed at equally distributed points. Proc. Pedag. Inst. Vitebsk 3, 117–127 (1940) (in Russian)

  30. Wilson, M.W.: Necessary and sufficient conditions for equidistant quadrature formula. SIAM J. Numer. Anal. 7(1), 134–141 (1970)

  31. Zippin, M.: Extension of bounded linear operators. In: Handbook of the Geometry of Banach Spaces, vol. 2, pp. 1703–1741. North-Holland, Amsterdam (2003)

  32. Zygmund, A.: Trigonometric Series. Cambridge University Press, Cambridge (2002)

Author information

Correspondence to Simon Foucart.

Additional information

Communicated by Wolfgang Dahmen.

This research was supported by the ONR Contracts N00014-15-1-2181 and N00014-16-1-2706, the NSF Grant DMS 1521067, and DARPA through Oak Ridge National Laboratory; by the NSF Grant DMS 1622134; and by the National Science Centre, Poland, Grant UMO-2016/21/B/ST1/00241.

Appendix

Finally, we provide full justifications for several results that were used earlier without proof, namely (3.12), (4.1), and Lemma 4.1. We start with (3.12).

Lemma 6.1

For any polynomial \(r\in \mathcal{P}_d\), one has

$$\begin{aligned} \Vert r\Vert ^2_{ C[-1,1]}\le \frac{(d+1)^2}{2}\Vert r\Vert ^2_{L_2[-1,1]}, \end{aligned}$$
(6.1)

and the inequality is sharp.

Proof

Let us consider the expansion of r with respect to the Legendre polynomials \(P_j\) normalized so that \(P_j(1)=1\) and \(\Vert P_j\Vert _{L_2[-1,1]}^2=\dfrac{2}{2j+1}\); that is,

$$\begin{aligned} r(x)= & {} \sum _{j=0}^d\Vert P_j\Vert ^{-2}_{L_2[-1,1]}\left( \intop \limits _{-1}^1 r(y)P_j(y)\,dy\right) P_j(x) =\intop \limits _{-1}^1 \left( \sum _{j=0}^d\frac{P_j(x)P_j(y)}{\Vert P_j\Vert ^{2}_{L_2[-1,1]}}\right) r(y)\,dy \\=: & {} \intop \limits _{-1}^1 k(x,y) r(y)\,dy. \end{aligned}$$

Since

$$\begin{aligned} |r(x)| = \Big | \intop \limits _{-1}^1 k(x,y) r(y) dy \Big | \le \Vert k(x,\cdot )\Vert _{L_2[-1,1]} \Vert r\Vert _{L_2[-1,1]}, \end{aligned}$$

the statement in the lemma follows from the fact that

$$\begin{aligned} \Vert k(x,\cdot )\Vert _{L_2[-1,1]}^2= & {} \intop \limits _{-1}^1\left( \sum _{j=0}^d\frac{P_j(x)P_j(y)}{\Vert P_j\Vert _{L_2[-1,1]}^{2}}\right) ^2\,dy= \sum _{j=0}^d \frac{P^2_j(x)}{ \Vert P_j\Vert _{L_2[-1,1]}^{4}}\intop \limits _{-1}^1P^2_j(y)\,dy\\= & {} \sum _{j=0}^d \frac{P^2_j(x)}{\Vert P_j\Vert _{L_2[-1,1]}^{2}} \le \sum \limits _{j=0}^d \frac{1}{\Vert P_j\Vert _{L_2[-1,1]}^{2}} = \sum _{j=0}^d \frac{2j+1}{2}= \frac{(d+1)^2}{2}. \end{aligned}$$

Inequality (6.1) is sharp because, for \(r=k(1,\cdot )\) and \(x=1\), all of the above inequalities become equalities: the Cauchy–Schwarz inequality is applied to proportional functions, and \(P_j^2(1)=1\) for every j. \(\square \)
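
The sharpness assertion is easy to check numerically. The following sketch (ours, not part of the paper) evaluates the ratio \(\Vert r\Vert ^2_{C[-1,1]}/\Vert r\Vert ^2_{L_2[-1,1]}\) for the extremal polynomial \(r=k(1,\cdot )\), whose Legendre coefficients are \((2j+1)/2\), and verifies that random polynomials of the same degree stay below the bound \((d+1)^2/2\).

```python
# Numerical check of Lemma 6.1 (illustrative sketch).
import numpy as np
from numpy.polynomial import legendre

def ratio(coeffs):
    """||r||_{C[-1,1]}^2 / ||r||_{L_2[-1,1]}^2 for r given by Legendre coefficients."""
    xs = np.linspace(-1.0, 1.0, 20001)
    sup = np.max(np.abs(legendre.legval(xs, coeffs)))      # ~ ||r||_{C[-1,1]}
    k = np.arange(len(coeffs))
    l2sq = np.sum(coeffs**2 * 2.0 / (2.0 * k + 1.0))       # ||r||_{L_2[-1,1]}^2, exact
    return sup**2 / l2sq

d = 10
extremal = (2.0 * np.arange(d + 1) + 1.0) / 2.0            # coefficients of k(1, .)
print(ratio(extremal), (d + 1)**2 / 2.0)                   # both equal 60.5
rng = np.random.default_rng(0)
print(max(ratio(rng.standard_normal(d + 1)) for _ in range(100)) <= (d + 1)**2 / 2.0)
```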

Next, we continue by restating (4.1).

Lemma 6.2

Let V be an n-dimensional subspace of C(D) and \(x_1,\ldots ,x_m \in D\) be m distinct points in D. If \(\mathcal{N}:= \{ \eta \in C(D): \eta (x_1)= \cdots = \eta (x_m)=0 \}\), then

$$\begin{aligned} \mu (\mathcal{N},V)_{C(D)} = 1 + \mu (V,\mathcal{N})_{C(D)}. \end{aligned}$$

Proof

In view of (1.2), it is enough to establish that

$$\begin{aligned} \mu (\mathcal{N},V)_{C(D)} \ge 1 + \mu (V,\mathcal{N})_{C(D)}. \end{aligned}$$
(6.2)

Let us define

$$\begin{aligned} \mu := \mu (V,\mathcal{N})_{C(D)} = \max _{v \in V} \frac{\Vert v\Vert _{C(D)}}{\displaystyle {\max _{1 \le j \le m} |v(x_j)|}}, \end{aligned}$$

and pick \(v \in V\) with \(\max _{1 \le j \le m} |v(x_j)| = 1\) and \(\Vert v\Vert _{C(D)} = \mu \). If \( \mu > 1\), choose \(x^* \in D\) such that \(|v(x^*)| = \mu \) and therefore \(x^* \not \in \{x_1,\ldots ,x_m\}\). If \(\mu = 1\), choose \(x^* \in D \setminus \{x_1,\ldots ,x_m\}\) such that \(|v(x^*)| \ge \mu - \delta \) for an arbitrarily small \(\delta > 0\). We introduce a function \(h \in C(D)\) satisfying

$$\begin{aligned} h(x_j) = v(x_j), \; j=1,\ldots ,m, \qquad h(x^*) = -\mathrm{sgn}(v(x^*)), \qquad \Vert h\Vert _{C(D)} = 1. \end{aligned}$$

Clearly, the function \(\eta := v-h\) belongs to \(\mathcal{N}\), and we have

$$\begin{aligned} \mu (\mathcal{N},V)_{C(D)} \ge \frac{\Vert \eta \Vert _{C(D)}}{\Vert \eta - v\Vert _{C(D)}} = \frac{\Vert v-h\Vert _{C(D)}}{\Vert h\Vert _{C(D)}} \ge |v(x^*)-h(x^*)| \ge \mu - \delta + 1. \end{aligned}$$

Since \(\delta > 0\) was arbitrary (and \(\delta = 0\) when \(\mu > 1\)), this proves (6.2). \(\square \)
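
In concrete settings, the quantity \(\mu (V,\mathcal{N})_{C(D)}\) appearing above can be approximated numerically: for each fixed location \(x^*\), maximizing \(v(x^*)\) subject to \(|v(x_j)|\le 1\), \(j=1,\ldots ,m\), is a linear program in the coefficients of v. The sketch below (an illustration of ours; the Chebyshev basis, the evaluation grid, and the equispaced sample points are convenience choices) does this for V the algebraic polynomials of degree less than n on \(D=[-1,1]\); by Lemma 6.2, adding 1 to the output gives \(\mu (\mathcal{N},V)_{C(D)}\).

```python
# Sketch: approximate mu(V, N)_{C(D)} = max_{v in V} ||v||_{C(D)} / max_j |v(x_j)|
# for V = polynomials of degree < n on D = [-1,1] and equispaced sample points,
# by one small linear program per candidate location of the maximum.
import numpy as np
from numpy.polynomial import chebyshev
from scipy.optimize import linprog

def mu_V_N(n, sample_pts, grid_size=1001):
    T = chebyshev.chebvander(sample_pts, n - 1)      # rows T_k(x_j); a stable basis
    A_ub = np.vstack([T, -T])                        # |v(x_j)| <= 1 for all j
    b_ub = np.ones(2 * len(sample_pts))
    bounds = [(None, None)] * n                      # coefficients unconstrained
    best = 0.0
    # each LP is bounded since n <= m guarantees that V intersects N only at 0
    for xs in np.linspace(-1.0, 1.0, grid_size):
        c = -chebyshev.chebvander(np.array([xs]), n - 1).ravel()   # maximize v(xs)
        best = max(best, -linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun)
    return best

m, n = 20, 10
x = np.linspace(-1.0, 1.0, m)                        # equispaced sampling points
mu = mu_V_N(n, x)
print(mu, 1.0 + mu)                                  # mu(V,N) and, by Lemma 6.2, mu(N,V)
```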

Finally, we prove Lemma 4.1 stated in a slightly different version below.

Lemma 6.3

Let \(\theta _1,\ldots ,\theta _N\) be N distinct points in \(\mathbb {R}^n\) with convex hull \(\mathcal{C}:= \mathrm{conv}\{\theta _1,\ldots ,\theta _N\}\). Then, there exist functions \(\psi ^{(N)}_j:\mathcal{C}\rightarrow \mathbb {R}\), \(j=1,\ldots ,N\), such that

  1. (i)

    \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) are continuous on \(\mathcal{C}\);

  2. (ii)

    for any affine function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\) (in particular for \(\lambda (\theta )=1\) and for each coordinate function \(\lambda (\theta )=\theta _k\)),

    $$\begin{aligned} \sum _{i=1}^N \psi ^{(N)}_i(\theta ) \lambda (\theta _i) = \lambda (\theta ) \qquad \text{ whenever } \theta \in \mathcal{C}; \end{aligned}$$
  3. (iii)

    for all \(i=1,\ldots , N\), \(\psi ^{(N)}_i(\theta ) \ge 0\) whenever \(\theta \in \mathcal{C}\);

  4. (iv)

    for all \(i,j = 1,\ldots ,N\), \(\psi ^{(N)}_i(\theta _j) = \delta _{i,j}\).

Proof

We proceed by induction on \(N \ge 1\). The result is clear for \(N=1\) and \(N=2\). Let us assume that it holds up to \(N-1\) for some integer \(N \ge 3\) and that we are given N distinct points \(\theta _1,\ldots ,\theta _N \in \mathbb {R}^n\). We separate two cases.

Case 1: Each \(\theta _j\) is an extreme point of \(\mathcal{C}:= \mathrm{conv} \{ \theta _1,\ldots , \theta _N \}\). In this case, we invoke the result of Kalman [19] and consider the functions \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) constructed there, which satisfy (i)–(iii). Condition (iv) then follows from (ii) and (iii). Indeed, given \(j = 1,\ldots ,N\), since \(\theta _j\) is an extreme point, one can find an affine function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\) such that \(\lambda (\theta _j)=0\) and \(\lambda (\theta _i) >0\) for all \(i \not = j\). Therefore,

$$\begin{aligned} \sum _{i=1}^N \psi ^{(N)}_i(\theta _j) \lambda (\theta _i) = \lambda (\theta _j)=0 \end{aligned}$$

implies that \(\psi ^{(N)}_i(\theta _j) =0 \) for all \(i \not = j\), and then \(\psi ^{(N)}_j(\theta _j) =1 \) follows from \(\sum _{i=1}^N \psi ^{(N)}_i(\theta _j) =1\).

Case 2: One of the \(\theta _j\)’s belongs to the convex hull of the other \(\theta _i\)’s, say \(\theta _N \in \mathrm{conv} \{ \theta _1,\ldots , \theta _{N-1} \}\). Let \(\psi ^{(N-1)}_1,\ldots ,\psi ^{(N-1)}_{N-1}\) be the functions defined on \(\mathcal{C}= \mathrm{conv} \{ \theta _1,\ldots , \theta _{N-1} \} = \mathrm{conv} \{ \theta _1,\ldots , \theta _N \}\) that are obtained from the induction hypothesis applied to the \(N-1\) distinct points \(\theta _1,\ldots ,\theta _{N-1}\). Next, we introduce the set \(\Omega \), which has at least two elements (if it had a single element k, then (ii) and (iii) would give \(\psi ^{(N-1)}_k(\theta _N)=1\) and hence \(\theta _N=\theta _k\), contradicting the distinctness of the points), and the function \(\tau \), which is continuous on \(\mathcal{C}\), given by

$$\begin{aligned} \Omega := \{ j = 1,\ldots ,N-1: \psi ^{(N-1)}_j(\theta _N) > 0 \}, \quad \tau (\theta ) := \min _{j \in \Omega } \frac{\psi ^{(N-1)}_j(\theta )}{\psi ^{(N-1)}_j(\theta _N)}, \quad \theta \in \mathcal{C}. \end{aligned}$$

Finally, we define functions \(\psi ^{(N)}_1,\ldots ,\psi ^{(N)}_N\) by

$$\begin{aligned} \psi ^{(N)}_i(\theta ) := \psi ^{(N-1)}_i(\theta ) - \psi ^{(N-1)}_i(\theta _N) \tau (\theta ), \quad i=1,\ldots ,N-1, \quad \psi ^{(N)}_N(\theta ) := \tau (\theta ). \end{aligned}$$

These are continuous functions of \(\theta \in \mathcal{C}\), so (i) is satisfied. To verify (ii), given an affine function \(\lambda : \mathbb {R}^n \rightarrow \mathbb {R}\), we observe that

$$\begin{aligned} \sum _{i=1}^N \psi ^{(N)}_i (\theta ) \lambda (\theta _i)= & {} \sum _{i=1}^{N-1} \psi ^{(N-1)}_i (\theta ) \lambda (\theta _i) - \tau (\theta ) \sum _{i=1}^{N-1} \psi ^{(N-1)}_i (\theta _N) \lambda (\theta _i) + \tau (\theta ) \lambda (\theta _N)\\= & {} \lambda (\theta ) - \tau (\theta ) \lambda (\theta _N) + \tau (\theta ) \lambda (\theta _N) = \lambda (\theta ). \end{aligned}$$

As for (iii), given \(\theta \in \mathcal{C}\), the fact that \(\psi ^{(N)}_N(\theta ) \ge 0\) is clear from the definition of \(\tau \), and for \(i=1,\ldots ,N-1\), the fact that \(\psi ^{(N)}_i(\theta ) \ge 0\) is equivalent to \(\psi ^{(N-1)}_i(\theta _N) \tau (\theta ) \le \psi ^{(N-1)}_i(\theta )\), which is obvious if \(i \not \in \Omega \) and follows from the definition of \(\tau \) if \(i \in \Omega \). Finally, to prove (iv), it is enough to verify that \(\psi ^{(N)}_i(\theta _i) = 1\) for all \(i=1,\ldots ,N\): by (ii) applied to \(\lambda \equiv 1\) together with (iii), this forces \(\psi ^{(N)}_i(\theta _j) = 0\) whenever \(i \not = j\). The claim clearly holds for \(i=N\), since \(\tau (\theta _N) = 1\). For \(i=1,\ldots ,N-1\), it follows from the identity \(\psi ^{(N-1)}_i(\theta _N) \tau (\theta _i) = 0\), which yields \(\psi ^{(N)}_i(\theta _i) = \psi ^{(N-1)}_i(\theta _i) = 1\). This identity holds when \(i \not \in \Omega \) because \(\psi ^{(N-1)}_i(\theta _N) = 0\), and when \(i \in \Omega \) because \(\tau (\theta _i) = 0\), the set \(\Omega \) containing some \(j \not = i\) with \(\psi ^{(N-1)}_j(\theta _i) = 0\). We have now shown that the statement holds for N, and this concludes the inductive proof. \(\square \)
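
The recursion of Case 2 is straightforward to implement once base-case functions are available. The sketch below (ours, under the simplifying assumption that the first points form a nondegenerate simplex, so that Kalman's functions reduce to ordinary barycentric coordinates) adds an interior point via \(\tau \) and the formulas above and checks properties (ii)-(iv) numerically; all names and the chosen points are illustrative.

```python
# Sketch of the Case 2 recursion of Lemma 6.3 for generalized barycentric coordinates.
import numpy as np

def barycentric_simplex(vertices):
    """Ordinary barycentric coordinates on a nondegenerate simplex in R^n."""
    V = np.vstack([np.asarray(vertices, float).T, np.ones(len(vertices))])
    Vinv = np.linalg.inv(V)                          # solves V b = [theta; 1]
    return lambda theta: Vinv @ np.append(np.asarray(theta, float), 1.0)

def add_point(psi, theta_new):
    """Extend coordinates psi by one more point of the hull (Case 2 of Lemma 6.3)."""
    p_new = psi(theta_new)
    omega = p_new > 1e-12                            # the index set Omega
    def new_psi(theta):
        p = psi(theta)
        tau = np.min(p[omega] / p_new[omega])        # tau(theta)
        return np.append(p - p_new * tau, tau)       # psi^(N)_1, ..., psi^(N)_N
    return new_psi

# Example in R^2: a triangle plus one interior point.
pts = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
psi = barycentric_simplex(pts)
theta4 = np.array([0.25, 0.25])
psi = add_point(psi, theta4)
pts.append(theta4)
theta = np.array([0.3, 0.2])                         # an arbitrary point of the hull
coeffs = psi(theta)
print((coeffs >= -1e-12).all())                                   # (iii)
print(np.allclose(sum(c * p for c, p in zip(coeffs, pts)), theta),
      np.isclose(coeffs.sum(), 1.0))                              # (ii)
print(np.allclose([psi(p)[i] for i, p in enumerate(pts)], 1.0))   # (iv)
```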


Cite this article

DeVore, R., Foucart, S., Petrova, G. et al. Computing a Quantity of Interest from Observational Data. Constr Approx 49, 461–508 (2019). https://doi.org/10.1007/s00365-018-9433-7
