
Parallel MARS Algorithm Based on B-splines

Published in: Computational Statistics

Summary

We investigate one possible way of improving Friedman’s Multivariate Adaptive Regression Splines (MARS) algorithm, which is designed for flexible modelling of high-dimensional data. In our version of MARS, called BMARS, we use B-splines instead of truncated power basis functions. The fact that B-splines have compact support allows us to introduce the notion of a “scale” of a basis function. The algorithm starts building up models using large-scale basis functions and switches over to a smaller scale once the fitting ability of the large-scale splines has been exhausted. The process is repeated until a prespecified number of basis functions has been produced. In addition, we discuss a parallelisation of BMARS as well as an application of the algorithm to the processing of a large commercial data set. The results demonstrate the computational efficiency of our algorithm and its ability to generate models competitive with those of the original MARS.
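The coarse-to-fine idea described above can be illustrated with a toy one-dimensional sketch. This is not the paper's full algorithm (which handles multivariate tensor-product bases, backward elimination and parallel execution); the helper names, candidate grid and tolerance below are our own illustrative choices:

```python
import numpy as np

def hat(x, t0, t1, t2):
    """Piecewise-linear B-spline: rises from 0 at t0 to a peak of 1 at t1,
    falls back to 0 at t2, and is zero outside [t0, t2]."""
    up = np.clip((x - t0) / (t1 - t0), 0.0, 1.0)
    down = np.clip((t2 - x) / (t2 - t1), 0.0, 1.0)
    return np.minimum(up, down)

def bmars_1d(x, y, scales=(0.5, 0.25, 0.125), tol=1e-3, max_basis=8):
    """Greedy forward selection, one scale at a time: keep adding the best
    hat function of the current scale until the RSS improvement drops
    below tol, then switch to the next (smaller) scale."""
    X = np.ones((len(x), 1))                  # model starts with the intercept
    rss = np.sum((y - y.mean()) ** 2)
    chosen = []
    for s in scales:
        peaks = np.arange(s, 1.0 - s / 2, s)  # candidate peak locations on [0, 1]
        while len(chosen) < max_basis:
            best = None
            for p in peaks:
                Xc = np.column_stack([X, hat(x, p - s, p, p + s)])
                coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)
                r = np.sum((y - Xc @ coef) ** 2)
                if best is None or r < best[0]:
                    best = (r, p)
            if rss - best[0] < tol:           # this scale is exhausted
                break
            rss = best[0]
            chosen.append((best[1], s))
            X = np.column_stack([X, hat(x, best[1] - s, best[1], best[1] + s)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return chosen, coef, rss
```

When the target function is itself a small-scale hat, the sketch first absorbs what it can at the coarse scale and then recovers the exact basis function once the matching scale is reached.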

[Figures 1–6 appear in the published article.]

Notes

  1. In fact, the ratio of the cost of optimising the knots of a B-spline to that of optimising a single knot of a truncated power basis function is roughly 1 : Kr, where K is the total number of candidate knot locations on a particular variable.

  2. In this paper, the multivariate models which are piecewise linear in each covariate are referred to as piecewise linear models. The same applies to the piecewise quadratic models mentioned in the later sections of the paper.

  3. The set of B-splines of the largest scale turns out to consist of a single linear function.

  4. Friedman (1991) recommends setting Jmax to 2Jfinal, where Jfinal is the size of the model after elimination of the suboptimal basis functions (see below). Thus, one generally has to run MARS or BMARS several times to determine the optimal value for Jmax.

  5. The knots of a bivariate tensor product basis function \(T\left( {x,y} \right) = B_{{t_1}}^{{l_1}}\left( x \right)B_{{t_2}}^{{l_2}}\left( y \right)\) are the four corners of its support rectangle (x1, y1), (x1, y3), (x3, y3), (x3, y1) as well as the location of its peak (x2, y2), where (x1, x2, x3) and (y1, y2, y3) are the knots of the univariate splines \(B_{{t_1}}^{{l_1}}\left( x \right)\) and \(B_{{t_2}}^{{l_2}}\left( y \right)\) respectively.

  6. One least-squares fit per candidate basis function (13) defined by an admissible triplet (j, v, t).
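The tensor-product construction in note 5 can be checked numerically. The sketch below (with our own helper names, not the paper's notation) builds a bivariate basis function from two linear B-splines and confirms that it peaks at (x2, y2) and vanishes at the corners of its support rectangle:

```python
import numpy as np

def hat(u, t0, t1, t2):
    """Univariate linear B-spline with knots (t0, t1, t2): peak of 1 at t1,
    zero outside [t0, t2]."""
    up = np.clip((u - t0) / (t1 - t0), 0.0, 1.0)
    down = np.clip((t2 - u) / (t2 - t1), 0.0, 1.0)
    return np.minimum(up, down)

def tensor_hat(x, y, kx, ky):
    """Bivariate tensor-product basis T(x, y) = B(x) * B(y); its support is
    the rectangle [kx[0], kx[2]] x [ky[0], ky[2]], with peak at (kx[1], ky[1])."""
    return hat(x, *kx) * hat(y, *ky)
```

Evaluating `tensor_hat` at the peak gives 1, and evaluating it at any of the four corners of the support rectangle gives 0, matching the knot description in note 5.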

References

  • Breiman, L., Friedman, J.H., Olshen, R.A. & Stone, C.J. (1984), Classification and Regression Trees, Wadsworth, Belmont, California.

  • Chen, Z. (1990), Beyond additive models: interactions by smoothing spline methods, Technical Report SMS-009-90, The Australian National University.

  • Cox, M.G. (1981), Practical spline approximation, Topics in Numerical Analysis, Lancaster, 79–112.

  • Fayyad, U., Piatetsky-Shapiro, G. & Smyth, P. (1996), From Data Mining to Knowledge Discovery: An Overview, in ‘Advances in Knowledge Discovery and Data Mining’, pp. 1–36.

  • Friedman, J.H. (1991), ‘Multivariate Adaptive Regression Splines’, The Annals of Statistics, 19(1), 1–141.

  • Friedman, J.H. (1981), Estimating functions of mixed ordinal and categorical variables, Technical Report 108, Stanford University.

  • Friedman, J.H. & Stuetzle, W. (1981), ‘Projection Pursuit Regression’, Journal of the American Statistical Association, 76, 817–823.

  • Geist, A., Beguelin, A., Dongarra, J., Jiang, W., Manchek, R. & Sunderam, V. (1994), PVM: Parallel Virtual Machine, MIT Press.

  • George, E.I. & McCulloch, R.E. (1993), ‘Variable selection via Gibbs sampling’, Journal of the American Statistical Association, 88, 881–889.

  • Luenberger, D.G. (1984), Linear and Nonlinear Programming, Addison-Wesley, Reading, Massachusetts.

  • McCullagh, P. & Nelder, J.A. (1983), Generalized Linear Models, Chapman and Hall.

  • Miller, A.J. (1990), Subset Selection in Regression, Chapman and Hall.

  • Stone, G. (1997), Analysis of Motor Vehicle Claims Data using Statistical Data Mining, CMIS Confidential Report CMIS-97/73, CSIRO, Australia.

  • Wahba, G. (1990), Spline Models for Observational Data, SIAM, Philadelphia.

Acknowledgements

We are most grateful to Prof J.H. Friedman for suggesting the idea of the experiment involving the synthetic data set and to Dr B. Turlach for very fruitful discussions. Our thanks are also due to the anonymous referees for their constructive comments, which greatly helped to improve the quality of this paper. The research of S. Bakin was supported by the Australian Government (Overseas Postgraduate Research Scholarship), by the Australian National University (ANU PhD Scholarship) and by the Advanced Computational Systems CRC (ACSys), Australia.

Cite this article

Bakin, S., Hegland, M. & Osborne, M.R. Parallel MARS Algorithm Based on B-splines. Computational Statistics 15, 463–484 (2000). https://doi.org/10.1007/PL00022715
