Abstract
Methods for regression on compositional covariates have recently been proposed that use the isometric log-ratio (ilr) representation of the compositional parts. This approach consists of first applying standard regression to the ilr coordinates and then transforming the estimated ilr coefficients into their contrast log-ratio counterparts. This gives easy-to-interpret parameters indicating the relative effect of each compositional part. In this work we present an extension of this framework in which compositional covariate effects are allowed to be smooth in the ilr domain. This is achieved by fitting a smooth function over the multidimensional ilr space using Bayesian P-splines, with smoothness induced by random walk priors on the spline coefficients in a hierarchical Bayesian framework. The proposed methodology is applied to spatial data from an ecological survey on a gypsum outcrop located in the Emilia-Romagna Region, Italy.
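The two-step linear approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Helmert-type ilr basis, the simulated Dirichlet data, the function names (`ilr_basis`, `ilr`, `rw_structure`) and the second-order random walk are all assumptions made for illustration, and ordinary least squares stands in for the Bayesian fit. The last function only builds the structure matrix that a random walk prior on spline coefficients would use; it does not fit a P-spline model.

```python
import numpy as np

def ilr_basis(D):
    """D x (D-1) matrix with orthonormal columns that each sum to zero
    (one common Helmert-type choice of ilr basis; an assumption here)."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] *= np.sqrt(j / (j + 1.0))
    return V

def ilr(X, V):
    """ilr coordinates of the compositions in the rows of X.
    Because the columns of V sum to zero, log(X) @ V equals clr(X) @ V."""
    return np.log(X) @ V

def rw_structure(m, order=2):
    """Structure (penalty) matrix of a random walk prior of the given
    order on m coefficients: K = D'D with D the difference matrix."""
    Dm = np.diff(np.eye(m), n=order, axis=0)
    return Dm.T @ Dm

# Hypothetical simulated data: 3-part compositions and a response that is
# linear in the log parts, with coefficients summing to zero.
rng = np.random.default_rng(0)
X = rng.dirichlet([2.0, 3.0, 4.0], size=200)
gamma_true = np.array([0.5, -1.0, 0.5])
y = 1.0 + np.log(X) @ gamma_true + rng.normal(scale=0.1, size=200)

# Step 1: standard regression on the ilr coordinates (OLS as a stand-in).
V = ilr_basis(3)
Z = np.column_stack([np.ones(len(y)), ilr(X, V)])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = coef[1:]

# Step 2: map the ilr coefficients back to one coefficient per part.
gamma_hat = V @ beta_hat

# Structure matrix for a second-order random walk on 10 coefficients.
K = rw_structure(10, order=2)
```

Under these assumptions `gamma_hat` sums to zero by construction, so each entry can be read as the effect of one part relative to the others. The matrix `K` annihilates constant and linear coefficient sequences, which is why a second-order random walk penalizes only departures from linearity.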
Acknowledgments
We wish to thank Carlo Ferrari, Giovanna Pezzi and Andrea Velli for introducing us to the problem, providing data and performing data pre-processing. The research work underlying this paper was funded by a FIRB 2012 Grant (Project No. RBFR12URQJ; title: Statistical modeling of environmental phenomena: pollution, meteorology, health and their interactions) for research projects by the Italian Ministry of Education, Universities and Research. We thank two anonymous referees for their suggestions and comments.
Cite this article
Bruno, F., Greco, F. & Ventrucci, M. Non-parametric regression on compositional covariates using Bayesian P-splines. Stat Methods Appl 25, 75–88 (2016). https://doi.org/10.1007/s10260-015-0339-2
DOI: https://doi.org/10.1007/s10260-015-0339-2