Abstract
Data analysis is often one of the least friendly stages of a scientific study. Here, I offer suggestions and warn of five pitfalls within a proposed statistical routine that focuses on selecting predictor variables for multiple regression, a simple model used to answer questions commonly raised in Vegetation Ecology, and on verifying the assumptions of this method. I believe that this manuscript will clarify important points in the data analysis process and thereby help make studies in Vegetation Ecology more competitive internationally.
Notes
Here, I comment only on spatial independence, but note that temporal and phylogenetic independence must also be considered (see, for instance, Peres Neto 2006).
Removing outliers requires a well-defined criterion (e.g., Quinn and Keough 2002). Otherwise, their removal may introduce bias into the significance tests of relationships between variables.
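As an illustration of what a well-defined criterion could look like, the sketch below flags observations by Cook's distance in an ordinary least squares fit. This is only one possible choice and is not necessarily the criterion recommended by Quinn and Keough (2002). The function names `cooks_distance` and `flag_outliers` are hypothetical, and the default threshold of 4/n is a common rule of thumb, assumed here for the example. The point is that the rule is fixed before the significance of any relationship is examined.

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance of each observation in an OLS fit of y on X.

    X holds the predictors (1-D or 2-D, without an intercept column);
    an intercept column is added here.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n, p = X.shape  # p counts the intercept as a parameter

    # Ordinary least squares fit and residuals
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Leverages: diagonal of the hat matrix H = X (X'X)^-1 X'
    xtx_inv = np.linalg.pinv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Residual mean square
    mse = resid @ resid / (n - p)

    # D_i = e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2
    return resid**2 / (p * mse) * leverage / (1.0 - leverage) ** 2


def flag_outliers(X, y, threshold=None):
    """Boolean mask of observations whose Cook's distance exceeds the threshold.

    The default threshold of 4/n is an assumed rule of thumb, not a
    prescription from the article.
    """
    d = cooks_distance(X, y)
    if threshold is None:
        threshold = 4.0 / len(d)
    return d > threshold
```

Flagged observations should then be inspected, and any removal reported together with the criterion used, rather than dropped silently.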
References
Borcard D, Legendre P (2002) All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecol Model 153:51–68
Burnham KP, Anderson DR (2002) Model selection and multimodel inference. A practical information-theoretical approach. Springer, New York
Callegari-Jacques SM (2003) Bioestatística: Princípios e Aplicações. Artmed, Porto Alegre
Diniz Filho JAF, Bini LM (2005) Modelling geographical patterns in species richness using eigenvector-based spatial filters. Glob Ecol Biogeogr 14:177–185
Diniz Filho JAF, Bini LM, Hawkins BA (2003) Spatial autocorrelation and red herrings in geographical ecology. Glob Ecol Biogeogr 12:53–64
Diniz Filho JAF, Rangel TFLVB, Bini LM (2008) Model selection and information theory in geographical ecology. Glob Ecol Biogeogr 17:479–488
Diwold K, Dullinger S, Dirnböck T (2010) Effect of nitrogen availability on forest understorey cover and its consequences for tree regeneration in the Austrian limestone Alps. Plant Ecol 209:11–22
Felfili JM, Roitman I, Medeiros MM, Sanchez M (2011a) Procedimentos e Métodos de Amostragem de Vegetação. In: Felfili JM, Eisenlohr PV, de Melo MMRF, Andrade LA, Meira Neto JAA (eds) Fitossociologia no Brasil: Métodos e Estudos de Casos, vol 1. Editora UFV, Viçosa, pp 86–121
Felfili JM, Carvalho FA, Libano AM, Venturoli F, Pereira BAS, Machado ELM (2011b) Análise multivariada: princípios e métodos em estudos de vegetação. In: Felfili JM, Eisenlohr PV, de Melo MMRF, Andrade LA, Meira Neto JAA (eds) Fitossociologia no Brasil: Métodos e Estudos de Casos, vol 1. Editora UFV, Viçosa, pp 122–155
Fortin M-J, Dale MRT (2005) Spatial analysis. A guide for ecologists. Cambridge University Press, Cambridge
Fotheringham AS, Brunsdon C, Charlton M (2002) Geographically weighted regression: the analysis of spatially varying relationships. Wiley, Chichester
Gotelli NJ, Ellison AM (2010) Princípios de estatística em ecologia. Artmed, Porto Alegre
Griffith D (2003) Spatial autocorrelation and spatial filtering: gaining understanding through theory and scientific visualization. Springer, Berlin
Kent M (2011) Vegetation description and data analysis. A practical approach. Wiley Blackwell, Chichester
Kupfer JA, Farris CA (2007) Incorporating spatial non-stationarity of regression coefficients into predictive vegetation models. Landsc Ecol 22:837–852
Legendre P, Fortin M-J (1989) Spatial pattern and ecological analysis. Vegetatio 80:107–138
Myers RH (1986) Classical and modern regression with applications. Duxbury Press, Boston
Oliveira Filho AT, Fontes MAL (2000) Patterns of floristic differentiation among Atlantic Forests in southeastern Brazil and the influence of climate. Biotropica 32:793–810
Oliveira Filho AT, Jarenkow JA, Rodal MJN (2006) Floristic relationships of seasonally dry forests of eastern South America based on tree species distribution patterns. In: Pennington RT, Ratter JA, Lewis GP (eds) Neotropical savannas and dry forests: plant diversity, biogeography and conservation. CRC Press, Boca Raton, pp 159–192
Peres Neto PR (2006) A unified strategy for estimating and controlling spatial, temporal and phylogenetic autocorrelation in ecological models. Oecol Bras 10:105–119
Quinn GP, Keough MJ (2002) Experimental design and data analysis for biologists. Cambridge University Press, Melbourne
Santos RM, Oliveira Filho AT, Eisenlohr PV, Queiroz LP, Cardoso DBOS, Rodal MJN (2012) Identity and relationships of the Arboreal Caatinga among other floristic units of seasonally dry tropical forests (SDTFs) of north-eastern and Central Brazil. Ecol Evol 2:409–428
Whittingham MJ, Stephens P, Bradbury RB, Freckleton RP (2006) Why do we still use stepwise modelling in ecology and behaviour? J Anim Ecol 75:1182–1189
Acknowledgments
I thank the two anonymous reviewers for their valuable contributions. I am especially grateful to my students, and to AT Oliveira Filho and MA Cupertino, for encouraging me to write this manuscript.
Cite this article
Eisenlohr, P.V. Challenges in data analysis: pitfalls and suggestions for a statistical routine in Vegetation Ecology. Braz. J. Bot 36, 83–87 (2013). https://doi.org/10.1007/s40415-013-0002-9
DOI: https://doi.org/10.1007/s40415-013-0002-9