
Variable Fidelity Regression Using Low Fidelity Function Blackbox and Sparsification

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9653)

Abstract

We consider the construction of surrogate models based on variable fidelity samples generated by a high fidelity function (an exact representation of some physical phenomenon) and by a low fidelity function (a coarse approximation of the exact representation). A surrogate model is constructed to replace the computationally expensive high fidelity function. Gaussian processes are commonly used for such tasks. However, once the sample size reaches a few thousand points, a direct application of Gaussian process regression becomes impractical due to its high computational cost. We propose two approaches to circumvent this difficulty. The first approach approximates the sample covariance matrices using the Nyström method. The second approach relies on the fact that engineers can often evaluate the low fidelity function on the fly at any point using some blackbox; thus, each time we compute a prediction of the high fidelity function at a point, we can update the surrogate model with the low fidelity function value at that point. In this way we avoid the issues related to the inversion of large covariance matrices, as the model is constructed from only a moderate-size low fidelity sample. We apply the developed methods to a real problem: optimization of the shape of a rotating disk.
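
As a minimal illustration of the second approach (not the exact formulation used in the paper), one can fit a Gaussian process to the discrepancy between fidelities on the high fidelity sample and, at prediction time, query the low fidelity blackbox at the requested point itself. The class name, the fixed scaling coefficient rho, and the use of scikit-learn are illustrative assumptions made for this sketch.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

class BlackboxAssistedSurrogate:
    """Hypothetical sketch: correct an on-the-fly low fidelity evaluation
    with a GP model of the discrepancy between fidelities."""

    def __init__(self, low_fidelity_blackbox, rho=1.0):
        self.low = low_fidelity_blackbox    # cheap solver, callable at any point
        self.rho = rho                      # assumed fixed scaling between fidelities
        self.gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                           alpha=1e-6, normalize_y=True)

    def fit(self, X_high, y_high):
        # Fit the discrepancy y_high - rho * y_low on the (small) high fidelity sample.
        y_low = np.array([self.low(x) for x in X_high])
        self.gp.fit(X_high, y_high - self.rho * y_low)
        return self

    def predict(self, x):
        # Evaluate the low fidelity blackbox at the prediction point itself,
        # then add the GP estimate of the discrepancy.
        return self.rho * self.low(x) + self.gp.predict(np.atleast_2d(x))[0]
```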



Acknowledgments

We thank Dmitry Khominich from DATADVANCE llc for making the solvers for the rotating disk problem available, and Tatyana Alenkaya from MIPT for proofreading the article. The research was conducted at IITP RAS and supported solely by the Russian Science Foundation grant (project 14-50-00150).

Author information

Corresponding author

Correspondence to A. Zaytsev.


Appendices


A Proof of Technical Statements

In this section we provide the proofs of the statements of Sect. 4.

Proof (Proof of Statement 1)

For the posterior mean we get:

$$\begin{aligned} \hat{\mathbf {y}}_h(\mathbf {x}^*)&\approx \mathbf {K}_1^* \mathbf {K}_{11}^{-1} \mathbf {K}_1^T (\mathbf {K}_1 \mathbf {K}_{11}^{-1} \mathbf {K}_1^T + \mathbf {R}^{-2})^{-1} \mathbf {y}= \mathbf {K}_1^* \mathbf {K}_{11}^{-1} \mathbf {K}_1^T \mathbf {R}(\mathbf {R}\mathbf {K}_1 \mathbf {K}_{11}^{-1} \mathbf {K}_1^T \mathbf {R}+ \mathbf {I}_{n})^{-1} \mathbf {R}\mathbf {y}=\\&= \mathbf {K}_1^* \mathbf {K}_{11}^{-1} \mathbf {C}_1^T (\mathbf {C}_1 \mathbf {K}_{11}^{-1} \mathbf {C}_1^T + \mathbf {I}_{n})^{-1} \mathbf {R}\mathbf {y}= \mathbf {K}_1^* \mathbf {K}_{11}^{-1} (\mathbf {C}_1^T \mathbf {C}_1 \mathbf {K}_{11}^{-1} + \mathbf {I}_{n_1})^{-1} \mathbf {C}_1^T \mathbf {R}\mathbf {y}= \\&= \mathbf {K}_1^* (\mathbf {C}_1^T \mathbf {C}_1 + \mathbf {K}_{11})^{-1} \mathbf {C}_1^T \mathbf {R}\mathbf {y}= \mathbf {K}_1^* (\mathbf {C}_1^T \mathbf {C}_1 + \mathbf {V}^T_{11} \mathbf {V}_{11})^{-1} \mathbf {C}_1^T \mathbf {R}\mathbf {y}=\\&= \mathbf {K}_1^* \mathbf {V}^{-1}_{11} (\mathbf {V}^{-T}_{11} \mathbf {C}_1^T \mathbf {C}_1 \mathbf {V}_{11}^{-1} + \mathbf {I}_{n_1})^{-1} \mathbf {V}^{-T}_{11} \mathbf {C}_1^T \mathbf {R}\mathbf {y}= \mathbf {K}_1^* \mathbf {V}^{-1}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^T \mathbf {R}\mathbf {y}. \end{aligned}$$

We use the same approach to derive an equation for the posterior variance:

$$\begin{aligned} \mathbb {V} \left( X^* \right) - (\rho ^2 \sigma _l^2 + \sigma _d^2) \mathbf {I}_{n^*}&\approx \mathbf {K}_1^* \mathbf {K}_{11}^{-1} \mathbf {K}_1^{*T} - \mathbf {K}_1^* \mathbf {K}_{11}^{-1} \mathbf {K}_1^T (\mathbf {R}^{-2} + \mathbf {K}_1 \mathbf {K}_{11}^{-1} \mathbf {K}_1^T)^{-1} \mathbf {K}_1 \mathbf {K}_{11}^{-1} \mathbf {K}_1^{*T}=\\&= \mathbf {K}_1^* (\mathbf {K}_{11}^{-1} - \mathbf {K}_{11}^{-1} \mathbf {K}_1^T (\mathbf {R}^{-2} + \mathbf {K}_1 \mathbf {K}_{11}^{-1} \mathbf {K}_1^T)^{-1} \mathbf {K}_1 \mathbf {K}_{11}^{-1}) \mathbf {K}_1^{*T}=\\&= \mathbf {K}_1^* (\mathbf {K}_{11} + \mathbf {K}_1^T \mathbf {R}^2 \mathbf {K}_1)^{-1} \mathbf {K}_1^{*T} = \mathbf {K}_1^* (\mathbf {V}^T_{11} \mathbf {V}_{11} + \mathbf {C}_1^T \mathbf {C}_1)^{-1} \mathbf {K}_1^{*T}= \\&= \mathbf {K}_1^* \mathbf {V}^{-1}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^{-T}_{11} \mathbf {K}_1^{*T}. \end{aligned}$$
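
For illustration, the final expressions of Statement 1 can be evaluated directly with a short NumPy sketch. The notation follows Sect. 4 of the paper (not reproduced in this excerpt), so the shapes below are assumptions: \(\mathbf {K}_1\) is \(n \times n_1\) (training points versus base points), \(\mathbf {K}_{11}\) is \(n_1 \times n_1\), \(\mathbf {K}_1^*\) is \(n^* \times n_1\), \(\mathbf {R}\) is diagonal, and the nugget term \(\rho ^2 \sigma _l^2 + \sigma _d^2\) is passed as a scalar.

```python
import numpy as np

def nystrom_posterior(K1_star, K1, K11, R_diag, y, nugget=0.0):
    """Sketch of the Statement 1 formulas; shapes and notation are assumed."""
    n1 = K11.shape[0]
    V11 = np.linalg.cholesky(K11).T          # K11 = V11^T V11 (upper-triangular V11)
    C1 = R_diag[:, None] * K1                # C1 = R K1
    V = np.linalg.solve(V11.T, C1.T).T       # V = C1 V11^{-1}, shape (n, n1)
    A = np.eye(n1) + V.T @ V                 # I_{n1} + V^T V
    # posterior mean: K1* V11^{-1} (I + V^T V)^{-1} V^T R y
    mean = K1_star @ np.linalg.solve(V11, np.linalg.solve(A, V.T @ (R_diag * y)))
    # posterior variance: nugget + diag(K1* V11^{-1} (I + V^T V)^{-1} V11^{-T} K1*^T)
    B = np.linalg.solve(V11, np.linalg.solve(A, np.linalg.solve(V11.T, K1_star.T)))
    var = nugget + np.einsum('ij,ji->i', K1_star, B)
    return mean, var
```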

Proof (Proof of Statement 2)

First of all, we have to calculate the matrices \(\mathbf {V}_{11}\) and \(\mathbf {V}= \mathbf {R}\mathbf {K}_1 \mathbf {V}_{11}^{-T}\). The matrix \(\mathbf {V}_{11}\) is of size \(n_1 \times n_1\), so we need \(O(n_1^3)\) operations to invert it. To calculate \(\mathbf {K}_1 \mathbf {V}_{11}^{-T}\) we need \(O(n_1^2 n)\) operations. Finally, as \(\mathbf {R}\) is a diagonal matrix, we use \(O(n_1 n)\) operations to get \(\mathbf {V}\).

In the case \(n^* = 1\), to obtain the posterior mean we have to calculate \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^T \mathbf {y}\). We use \(O(n_1^2 n)\) operations to calculate \(\mathbf {V}^T \mathbf {V}\), inverting \(\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V}\) takes \(O(n_1^3)\) operations, calculating \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^T\) requires an extra \(O(n_1^2 n)\) operations, and finally the posterior mean itself needs an additional \(O(n_1 n)\) operations. Consequently, to calculate the posterior mean we use \(O(n_1^2 n)\) operations.

In the same way, in order to calculate \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}_{11}^{-1}\) we need \(O(n_1^2 n)\) operations to compute \((\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1}\) and an additional \(O(n_1^3)\) operations to obtain the final matrix. Consequently, in order to calculate the posterior variance we use \(O(n_1^2 n)\) operations.

Finally, we need \(O(n_1^2 n)\) operations to compute the required matrices and \(O(n_1^2 n)\) operations to obtain the posterior mean and the posterior variance from these precomputed matrices. So, the total computational complexity is \(O(n_1^2 n)\).
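
As a sketch of this accounting, the expensive steps can be performed once, after which each posterior mean evaluation at a single new point reduces to an \(O(n_1)\) dot product with the precomputed weight vector; variable names and shapes are assumed as in the previous sketch, and the weights follow the posterior mean expression derived in Statement 1.

```python
import numpy as np

def precompute_mean_weights(K1, K11, R_diag, y):
    n1 = K11.shape[0]
    V11 = np.linalg.cholesky(K11).T                           # O(n1^3): K11 = V11^T V11
    V = np.linalg.solve(V11.T, (R_diag[:, None] * K1).T).T    # O(n1^2 n): V = R K1 V11^{-1}
    A = np.eye(n1) + V.T @ V                                  # O(n1^2 n)
    # O(n1^3) + O(n1 n): weight vector for the posterior mean
    return np.linalg.solve(V11, np.linalg.solve(A, V.T @ (R_diag * y)))

# After precomputation, the posterior mean at a new point x costs only O(n1):
#   mean_at_x = k1_star_x @ w, where k1_star_x holds covariances of x with the n1 base points.
```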

B Comparison of Low and High Fidelity Models for the Rotating Disk

Two solvers are available for calculating \(u_\mathrm{max}\) and \(s_\mathrm{max}\). The low fidelity function is computed by an Ordinary Differential Equations (ODE) solver based on a simple Runge–Kutta method. The high fidelity function is computed by a Finite Element Model (FEM) solver from ANSYS.

To compare the solvers, we draw scatter plots of the low and high fidelity values and also plot slices of the corresponding functions. We generate a random sample of points in a specified design space box, calculate the low and high fidelity function values, and plot the low fidelity values against the high fidelity values at the same points. The scatter plots are shown in Fig. 3: the difference between the values grows significantly as the values increase.

Fig. 3. Comparison of the high and the low fidelity solvers via scatter plots
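
A minimal sketch of this comparison procedure is given below. The design box bounds are placeholders (only the central point is specified in this appendix), and low_fidelity_solver / high_fidelity_solver are hypothetical callables standing in for the ODE and ANSYS FEM blackboxes.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder bounds for the design space box; the true bounds are not given in this excerpt.
BOUNDS = {"r1": (0.04, 0.08), "r2": (0.10, 0.16), "r3": (0.14, 0.18),
          "r4": (0.17, 0.20), "t1": (0.02, 0.034), "t3": (0.02, 0.034)}

def scatter_compare(low_fidelity_solver, high_fidelity_solver, n_points=100, seed=0):
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in BOUNDS.values()])
    hi = np.array([b[1] for b in BOUNDS.values()])
    X = lo + (hi - lo) * rng.random((n_points, len(BOUNDS)))   # uniform sample in the box
    y_low = np.array([low_fidelity_solver(x) for x in X])
    y_high = np.array([high_fidelity_solver(x) for x in X])
    plt.scatter(y_high, y_low, s=10)
    plt.plot([y_high.min(), y_high.max()], [y_high.min(), y_high.max()], "k--")  # y = x line
    plt.xlabel("high fidelity value")
    plt.ylabel("low fidelity value")
    plt.show()
```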

For the central point of the design space box, with \(r_1 = 0.06, r_2 = 0.13, r_3 = 0.16, r_4 = 0.185, t_1 = 0.027, t_3 = 0.027\), we construct one-dimensional slices by varying a single input variable within specified bounds. Slices along different input variables for \(u_\mathrm{max}\) and for \(s_\mathrm{max}\) are given in Fig. 4. In the case of \(u_\mathrm{max}\) the high and the low fidelity functions exhibit the same behaviour, and the low fidelity function models the high fidelity function accurately. For \(s_\mathrm{max}\) the high and the low fidelity functions sometimes differ: their behaviour differs for the slice along the \(r_1\) input, and the locations of the local maxima differ for the slice along the \(t_3\) input.

Fig. 4. Comparison of the high and the low fidelity solvers via outputs’ slices
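
A similar sketch for the one-dimensional slices: one input is varied over an interval while the remaining inputs are kept at the central point listed above; the solver callables and the slice bounds are again placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

CENTER = {"r1": 0.06, "r2": 0.13, "r3": 0.16, "r4": 0.185, "t1": 0.027, "t3": 0.027}

def plot_slice(low_fidelity_solver, high_fidelity_solver, varied, lower, upper, n=50):
    names = list(CENTER)
    grid = np.linspace(lower, upper, n)
    X = np.tile([CENTER[k] for k in names], (n, 1))   # all inputs fixed at the central point
    X[:, names.index(varied)] = grid                  # except the one being varied
    plt.plot(grid, [low_fidelity_solver(x) for x in X], label="low fidelity (ODE)")
    plt.plot(grid, [high_fidelity_solver(x) for x in X], label="high fidelity (FEM)")
    plt.xlabel(varied)
    plt.legend()
    plt.show()
```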


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zaytsev, A. (2016). Variable Fidelity Regression Using Low Fidelity Function Blackbox and Sparsification. In: Gammerman, A., Luo, Z., Vega, J., Vovk, V. (eds) Conformal and Probabilistic Prediction with Applications. COPA 2016. Lecture Notes in Computer Science, vol. 9653. Springer, Cham. https://doi.org/10.1007/978-3-319-33395-3_11

  • DOI: https://doi.org/10.1007/978-3-319-33395-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-33394-6

  • Online ISBN: 978-3-319-33395-3
